GenCore version 5.1.6
Copyright (c) 1993 - 2004 Compugen Ltd.

OM nucleic - nucleic search, using [illegible] model

Run on: January 14, 2004, 07:40:11; [illegible] 2319.58 Seconds
        9398.725 Million cell updates/sec



US-09-864-675-3 

r/ec. .o„: - „a...=aa™«. 

Sequence; ^ ^ 

Sco.i„, ta.le= ll^^^ZT. 3.pe.t 1.0 

191S9238056 residues 
S„.ch.a, Z2781392 se<,s, 121522 ^^^^^ 

Total number of hits satisfying chosen parameters: [illegible]

Minimum DB seq length: 0
Maximum DB seq length: [illegible]

Post-processing: Minimum Match 0%
                 [illegible]
                 Listing first 45 summaries



Database 



1: em_estba:*
2: em_esthum:*
3: em_estin:*
4: em_estmu:*
5: em_estov:*
6: em_estpl:*
7: em_estro:*
8: em_htc:*
9: gb_est1:*
10: gb_est2:*
11: gb_htc:*
12: gb_est3:*
13: gb_est4:*
14: gb_est5:*
15: em_estfun:*
16: em_estom:*
17: em_gss_hum:*
18: em_gss_inv:*
19: em_gss_pln:*
20: em_gss_vrt:*
21: em_gss_fun:*
22: em_gss_mam:*
23: em_gss_mus:*
24: em_gss_pro:*
25: em_gss_rod:*
26: em_gss_phg:*
27: em_gss_vrl:*



28: 
29: 



gb_gss2: 



Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.



SUMMARIES

                  %
Result          Query
   No.   Score  Match  Length  DB  ID         Description
     1     674   75.1     805  12  BI918620   BI918620 603176570
     2   565.6   63.1    1047  12  BM914622   BM914622 AGENCOURT
     3   467.2   52.1    1041  12  BI412864   BI412864 602988202
     4     467   52.1     524  13  BX281777   BX281777 BX281777
     5   408.2   45.5     549   9  AA706226   AA706226 ah28a07.s
     6   396.6   44.2     412   9  AI041451   AI041451 ow36c02.s
     7   363.8   40.6     427  10  BF108794   BF108794 7l52g03.x
     8   318.6   35.5     488   4  BX529505   BX529505 RZPD Mus
     9     255   28.4     949  12  BI410828   BI410828 602963734
    10   234.4   26.1     333  10  BE983573   BE983573 UI-M-CGOp
    11     224   25.0     297   9  AA772412   AA772412 ai44el2.s
    12   210.2   23.4     795  12  BI651936   BI651936 603298677
    13   195.2   21.8     259  10  BE648780   BE648780 UI-M-BH2.
    14   195.2   21.8     327   9  AA968077   AA968077 uh09h01.r
    15   182.6   20.4     362  13  BX089049   BX089049 BX089049
    16   169.8   18.9     529   9  AW476657   AW476657 uq79e01.y
    17   154.8   17.3     539   9  AL918370   AL918370 AL918370
    18     142   15.8     657  13  BQ078813   BQ078813 fy8lc06.y
    19     111   12.4     256   9  AW762061   AW762061 ur53c01.y
    20   107.6   12.0     458   9  AI152190   AI152190 udl8hl0.r
    21   105.2   11.7     750  29  BZ847665   BZ847665 CH240_239
    22     101   11.3     493  28  BH057870   BH057870 RPCl-24-9
    23   100.4   11.2     481  28  AZ987593   AZ987593 2M0270P10
    24    90.2   10.1     243  10  BB570162   BB570162 BB570162
    25      85    9.5     512   9  AI073386   AI073386 ool3d06.x
    26    81.2    9.1     167   9  AI836531   AI836531 UI-M-APO-
    27    76.6    8.5     477  10  BE984041   BE984041 UI-M-CGOp
    28    67.2    7.5     765  12  BI522417   BI522417 603175321
    29      67    7.5     769  12  BI413085   BI413085 602990205
    30      66    7.4     751  29  CNS04J6G   AL293137 Tetraodon
    31    64.8    7.2     321  10  BE983721   BE983721 UI-M-CGOp
    32    64.4    7.2     538   9  AL925790   AL925790 AL925790
c   33      61    6.8     491   9  AL909688   AL909688 AL909688
c   34    60.2    6.7     322   9  AL909689   AL909689 AL909689
    35    58.8    6.6     413  14  N62228     N62228 yz63c08.sl
    36    58.4    6.5     685  14  CA351220   CA351220 622234 NC
    37      53    5.9    1630  11  AK051824   AK051824 Mus muscu
    38    51.6    5.8     925  29  CNS0091P   AL053013 Drosophil
c   39    51.6    5.8     982  13  BX415111   BX415111 BX415111
    40    50.8    5.7     647  12  BI960178   BI960178 HVSMEn002
    41    50.4    5.6     251   9  AW045376   AW045376 UI-M-BHl-
    42    49.8    5.6     473  12  BI666105   BI666105 603287281
    43    49.8    5.6     658  14  CB059196   CB059196 NISC_jxl2
    44    49.8    5.6     705  12  BI662853   BI662853 603286287
    45    49.8    5.6     728  13  BQ180353   BQ180353 UI-M-EXO-



ALIGNMENTS 



RESULT 1
BI918620
LOCUS       [illegible]
DEFINITION  [illegible] mRNA sequence.
ACCESSION   BI918620
VERSION     BI918620.1  GI:16182295
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   [illegible]
  AUTHORS   [illegible]
  TITLE     [illegible]
  JOURNAL   [illegible]
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Life Technologies, Inc.
            cDNA Library Preparation: [illegible]
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: [illegible]
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLAM11607 row: k column: 18
            High quality sequence start: [illegible]
            High quality sequence stop: 778.
FEATURES             Location/Qualifiers
     source          1..805
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:5240969"
                     /lab_host="DH10B"
                     /clone_lib="NIH_MGC_121"
                     /note="Organ: brain; Vector: pCMV-SPORT6; Site_1: NotI;
                     Site_2: EcoRV (destroyed); RNA source anonymous pool of 3
                     fetal brains, female age 20 weeks, female age 24 weeks,
                     and male age 26 weeks. Library is oligo-dT primed and
                     directionally cloned (EcoRV site is destroyed upon
                     cloning). Average insert size 1.7 kb, insert size range
                     0.7-3.5 kb. Library is normalized and enriched for
                     full-length clones and was constructed by C. Gruber
                     (Invitrogen). Research Genetics tracking code 017. Note:
                     this is a NIH_MGC Library."
BASE COUNT      169 a    243 c    263 g    130 t
ORIGIN

  Query Match             75.1%;  Score 674;  DB 12;  Length 805;
  Best Local Similarity   98.7%;  Pred. No. 4.3e-151;
  Matches  732;  Conservative    0;  Mismatches    5;  Indels    5;  Gaps [illegible];



1 ATGAGGCGCGACCCGGCCCCCGGC-TTCTCCATGCTGCTCTTCGGTGT^ 59 

^ I I I I [ I I I I I I I I I I I I I 1 I I I N 1 1 1 I 1 1 I M 1 M N I I I I I M M I 11 M M I 

64 liiiiGcicGicCciiicCCCGGCGiTCXCCATGCTGCTCTTCGGTGTGTCGCTC 123 



Db 

119 

Qy 60 



Db 

QY 
Db 

QY 

Db 244 
QY 



CTACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACC^ 

.4 - i^i^;^^^^^^^ - 

120 GGGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTC^ 
.34 i^i^^i^:^^ 

180 GCCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGG 239 

::: irr=^^^^ 303 

304 iGGiiiGiiiiGciiGclGiTiATCAGCGTGGGCTCCTGTGTGCC^ 363 

^nn GCGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTG- 358 
Qy 300 GCGCTACATCTTTT , , , , , , , , , , , | , , , , , | | | 1 1 M I I I I I I I I I 

364 ^ilciiciicUiUicTiiAGCCCACGGAACAGCCCTTAGTCTTTAAGAC 423 

359 CCCCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACT 418 
Qy 359 CCCCCCTCGA A , , , , , , , , , , , , , | , | | 1 | | | | 1 | | | | | | I I I I I I 

!:iicCciiGiiiciiiciGcZiJvTCTCAAGAAAG^ 

^CTGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGG^^^^ 478 
liiGiGiiicicGGCCCAiGiiGiiG^^^^ 

AGCAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCC^^^ 
lGGiiGiiiiUiicic]J.CciTCAGCCGAGACATTCGCATCAAATAT 

AAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGC^ 656 
jJ^^ciiicGiciiciGTicA^^^ 

CGAGG-CCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTT^^ 715 

Mill 1 II I I I I II M I I I 11 I I I M I 1 M I I I I I I M N I Ml I I I I I I I 11 I 11 



I I I II 1 I I I M I I I I M I 1 I 



Db 


424 ( 


Qy 


419 . 


Db 


484 . 


Qy 


479 


Db 


544 


Qy 


539 


Db 


604 


Qy 


598 


Db 


664 


Qy 


657 


Db 


724 


Qy 


716 


Db 


784 



RESULT 2
BM914622
LOCUS       BM914622                1047 bp    mRNA    linear   EST 12-MAR-2002
DEFINITION  AGENCOURT_6615334 NIH_MGC_113 Homo sapiens cDNA clone IMAGE:5480308
            5', mRNA sequence.
ACCESSION   BM914622
VERSION     BM914622.1  GI:19365001
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 1047)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Dr. Mark Watson
            cDNA Library Preparation: Rubin Laboratory
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Agencourt Bioscience Corporation
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLCM2002 row: p column: 05
            High quality sequence stop: 541.
FEATURES             Location/Qualifiers
     source          1..1047
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:5480308"
                     /lab_host="DH10B (phage-resistant)"
                     /clone_lib="NIH_MGC_113"
                     /note="Organ: spleen; Vector: pOTB7; Site_1: XhoI; Site_2:
                     EcoRI; cDNA made by oligo-dT priming. Directionally cloned
                     into EcoRI/XhoI sites using the following 5' adaptor:
                     GGCACGAG(G). Library constructed [illegible]
                     laboratory of Gerald M. Rubin (University of California,
                     Berkeley) using ZAP-cDNA synthesis kit (Stratagene) and
                     Superscript II RT (Life Technologies). Note: this is a
                     NIH_MGC Library."
BASE COUNT      263 a    347 c    254 g    183 t
ORIGIN

  Query Match             63.1%;  Score 565.6;  DB 12;  Length 1047;
  Best Local Similarity   96.8%;  Pred. No. 4.6e-125;
  Matches  577;  Conservative    0;  Mismatches   19;  Indels    0;  Gaps    0;



Qy        272 GCTCCTGTGTGCCGCTCGAAAGGAACCAGCGCTACATCTTTTTCCTGGAGCCCACGGAAC 331
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 GCTCCTGTGTGCCGCTCGAAAGGAACCAGCGCTACATCTTTTTCCTGGAGCCCACGGAAC 60

Qy        332 AGCCCTTAGTCTTTAAGACGGCCTTTGCCCCCCTCGATACCAACGGCAAAAATCTCAAGA 391
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 AGCCCTTAGTCTTTAAGACGGCCTTTGCCCCCCTCGATACCAACGGCAAAAATCTCAAGA 120

Qy        392 AAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCCGGCCCAAGTTGAAGAAGATGA 451
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 AAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCCGGCCCAAGTTGAAGAAGATGA 180

Qy        452 AGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGAAGTGTGAGGCAGCAGCCGGTA 511
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 AGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGAAGTGTGAGGCAGCAGCCGGTA 240

Qy        512 ATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGGAGCTCAACCGCAGCCGAGACA 571
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 ATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGGAGCTCAACCGCAGCCGAGACA 300

Qy        572 TTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGG 631
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 TTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGG 360

Qy        632 TGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCC 691
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 TGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCC 420

Qy        692 GGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCC 751
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 GGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCC 480

Qy        752 GGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATGGAGGCGTCTGCTACTACATCG 811
              ||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||||
Db        481 GGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATGGAGGCGCCTGCTACTACATCG 540

Qy        812 AGGGCATCAACCAGCTCTCCTGCAAGTGTCCTGTGGGATACACCGGGGACAGGTGT 867
              ||| |||||| ||||| |||||||| |||||    |||| |  | |    |  |||
Db        541 AGGCCATCAATCAGCTTTCCTGCAAATGTCCCAATGGATTCTTCCGACCAACATGT 596





RESULT 3
BI412864/C
LOCUS       BI412864                1041 bp    mRNA    linear   EST 14-AUG-2001
DEFINITION  602988202F1 NCI_CGAP_Lu33 Mus musculus cDNA clone IMAGE:5144016 5',
            mRNA sequence.
ACCESSION   BI412864
VERSION     BI412864.1  GI:15173787
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 1041)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Gilbert Smith, Ph.D.
            cDNA Library Preparation: M. Bento Soares, Ph.D., M. Fatima
            Bonaldo, Ph.D.
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Incyte Genomics, Inc.
            Clone distribution: NCI-CGAP clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLAM11355 row: d column: 01
            High quality sequence start: 11
            High quality sequence stop: 645.
FEATURES             Location/Qualifiers
     source          1..1041
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="CZECH II"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:5144016"
                     /tissue_type="pooled lung tumors"
                     /lab_host="DH10B (phage-resistant)"
                     /clone_lib="NCI_CGAP_Lu33"
                     /note="Organ: lung; Vector: pT7T3D-Pac (Pharmacia) with a
                     modified polylinker; Site_1: NotI; Site_2: EcoRI; 1st
                     strand cDNA was prepared from mRNA obtained from pooled
                     lung tumors with a Not I - oligo(dT) primer [5'
                     TGTTACCAATCTGAAGTGGGAGCGGCCGCCTCTGTTTTTTTTTTTTTTTTT 3'].
                     Double-stranded cDNA was ligated to Eco RI adaptors
                     (Pharmacia), digested with Not I and cloned into the Not
                     I and Eco RI sites of the modified pT7T3 vector. Library
                     went through one round of normalization, and was
                     constructed by Bento Soares and M. Fatima Bonaldo."
BASE COUNT      247 a    306 c    295 g    193 t
ORIGIN

  Query Match             52.1%;  Score 467.2;  DB 12;  Length 1041;
  Best Local Similarity   86.6%;  Pred. No. 1.8e-101;
  Matches  563;  Conservative    0;  Mismatches   78;  Indels    9;  Gaps    4;



Qy 

Db 

Qy 

Db 

Qy 
Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 



165 CAGCACCCGAGAGCCGCCCGCCTCGGGTCGGGT- GGCGTTGGTAAAGGTGCTGGACA 22 0 

II I I II III Ml II II II IN III I I I I II III "I" '111 

656 CACCTCGAGATGCCCGCCCGCCTCGGGTTCGGTTGGCGTCTTGGTGAAAGGTGCTGGACA 597 

221 AGTGGCCG— CTCCGGAGCGGGGGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTG 27 8 

I I I II II I I I I I I I I I I I II I I I II I I II I I I II I II I I I I I I I I I I I I I I I I I 

596 AGTTGCCGGCTCCCGGATCGGGGGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTG 537 

27 9 TGTGCCGCTCGAAAGGAACCAGCGCTACATCTT-TTTCCTGGAGCCCACGGAACAGCCCT 337 

II llllllllMIIIIIIIIIIIMMMIII III III II II III II M III MM 

536 TGCGCCGCTCGAAAGGAACCAGCGCTACATCTTGTTTCCTGGAGCCCACCGAGCAGCCCT 477 

338 TAGTCTTTAAGACGGCCTTTGCCCCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGG 397 

II I I I II I II I I 11 II M II I I I I I M I II M I II I I II I I I I M 

TAGTTTTTAAGACAGCCTTTTGCCCCGGTCGACCCTACGGCAAATACATCAAGAAAGAGG 417 



476 

398 TGGGCAAGATCCTGTGCACTGACTGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCC 

I M M I I I II 11 II II M I M I II I 1 I I I M I I I I I II M I I II I I II II 1 II II II II 

416 TGGGCAAGATCC 



TGTGCACTGACTGCGCCACCCGGCCCAAGCTGAAGAAGATGT^GAGCC 



458 AGACGGGACAGGTGGGTGAGAAGCAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCC 

MM Ml MM II I III MUM Mill NMMMMMM II M M MM 

356 AGACAGGAGAGGTGGGTG 



AGAAGCAGTCGCTCAAGTGTGAGGCAGCGGCGGGAAACCCCC 



457 



357 



517 



297 



577 



518 AGCCTTCCTACCGTTGGTTCAAGGATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCA 

III, II II I II I I II II M I II M II II II I M M M II II II II 11 11 II I 

296 AGCCCTCCTATCGCTGGTTCAAGGATGGCAAGGAACTCAACCGGAGTCGTGATATTCGCA 237 



Qy 

Db 

Qy 
Db 

Qy 

Db 

Qy 

Db 



57 8 TCAAATATGGCAACGGCAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGG 637 

I I I I I I I I I I M I I I I I I M I I I I I I M I I I I I I I I M I I I I I I I I I I I I I I M I 

236 TCAAGTATGGCAATGGCAGAAAGAACTCACGGCTACAGTTCAACAAAGTGAGGGTGGAGG 177 

638 ACGCTGGGGAGTATGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCC 697 

I II I I I I I II I I I I II I I I I I I I I I I I I II I II I I I I I I I I I I I I I I II I II I 

176 ATGCCGGGGAGTACGTCTGTGAGGCCGAGAACATCCTTGGGAAGGACACCGTGAGGGGCC 117 



698 



758 



GCAACGAGACAGCC7UVGTCCTA — TTGCGTCAATGGAGGCGTCTGCTACT 
I II I I II I I I I I I I I I I I I I II I I I I I I I I M I I I II I I I I I 
56 GCAATGAGACCGCCAAGTCCTACCATGTGTGAATGGAGGCGTGTGCTACT 7 



805 



757 



GGCTTTACGTCAACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGT 

I II I I I II I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I 

116 GACTCCATGTCAACAGCGTGAGCACCACTCTGTCATCCTGGTCGGGACATGCCCGGAAGT 57 



RESULT 4
BX281777
LOCUS       BX281777                 524 bp    mRNA    linear   EST 04-MAR-2003
DEFINITION  BX281777 NIH_MGC_121 Homo sapiens cDNA clone IMAGp998K1811607 ;
            IMAGE:5240969, mRNA sequence.
ACCESSION   BX281777
VERSION     BX281777.1  GI:28612804
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 524)
  AUTHORS   Ebert,L., Heil,O., Hennig,S., Neubert,P., Partsch,E., Peters,M.,
            Radelof,U., Schneider,D. and Korn,B.
  TITLE     Human UnigeneSet - RZPD3
  JOURNAL   Unpublished
COMMENT     Contact: Ina Rolfs
            RZPD Deutsches Ressourcenzentrum fuer Genomforschung GmbH
            Im Neuenheimer Feld 580, D-69120 Heidelberg, Germany
            RZPD; IMAGp998K1811607.
            RZPDLIB; I.M.A.G.E. cDNA Clone Collection;
            Human UnigeneSet - RZPD3 (RZPDLIB No. 972)
            http://www.rzpd.de/CloneCards/cgi-bin/showLib.pl.cgi/response?libNo=972
            Contact: Ina Rolfs
            RZPD Deutsches Ressourcenzentrum fuer Genomforschung GmbH
            Heubnerweg 6, D-14059 Berlin, Germany
            Tel: +49 30 32639 101
            Fax: +49 30 32639 111
            www.rzpd.de
            This clone is available royalty-free from RZPD;
            contact RZPD (clone@rzpd.de) for further information. Seq primer:
            M13u, Primer sequence: CGTTGTAAAACGACGGCCAGT.
FEATURES             Location/Qualifiers
     source          1..524
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGp998K1811607 ; IMAGE:5240969"
                     /lab_host="DH10B"
                     /clone_lib="NIH_MGC_121"
                     /note="Organ: brain; Vector: pCMV-SPORT6; Site_1: NotI;
                     Site_2: EcoRV (destroyed); RNA source anonymous pool of 3
                     fetal brains, female age 20 weeks, female age 24 weeks,
                     and male age 26 weeks. Library is oligo-dT primed and
                     directionally cloned (EcoRV site is destroyed upon
                     cloning). Average insert size 1.7 kb, insert size range
                     0.7-3.5 kb. Library is normalized and enriched for
                     full-length clones and was constructed by C. Gruber
                     (Invitrogen). Research Genetics tracking code 017. Note:
                     this is a NIH_MGC Library."
BASE COUNT       99 a    174 c    172 g     79 t
ORIGIN

  Query Match             52.1%;  Score 467;  DB 13;  Length 524;
  Best Local Similarity   100.0%;  Pred. No. 1.6e-101;
  Matches  467;  Conservative    0;  Mismatches    0;  Indels    0;  Gaps    0;



Qy          1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         58 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 117

Qy         61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        118 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 177

Qy        121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        178 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 237

Qy        181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        238 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 297

Qy        241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        298 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 357

Qy        301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        358 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 417

Qy        361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        418 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 477

Qy        421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACA 467
              |||||||||||||||||||||||||||||||||||||||||||||||
Db        478 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACA 524





RESULT 5
AA706226
LOCUS       AA706226                 549 bp    mRNA    linear   EST 12-JAN-1999
DEFINITION  ah28a07.s1 Soares_parathyroid_tumor_NbHPA Homo sapiens cDNA clone
            1240116 3' similar to TR:P43328 P43328 NEU DIFFERENTIATION FACTOR
            NDF04 mRNA sequence.
ACCESSION   AA706226
VERSION     AA706226.1  GI:2716144
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 549)
  AUTHORS   NCI-CGAP http://www.ncbi.nlm.nih.gov/ncicgap.
  TITLE     National Cancer Institute, Cancer Genome Anatomy Project (CGAP),
            Tumor Gene Index
  JOURNAL   Unpublished
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            cDNA Library Preparation: M. Bento Soares, Ph.D., M. Fatima Bonaldo,
            Ph.D.
            cDNA Library Arrayed by: Greg Lennon, Ph.D.
            DNA Sequencing by: Washington University Genome Sequencing Center
            Clone distribution: NCI-CGAP clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            www-bio.llnl.gov/bbrp/image/image.html
            Possible reversed clone: similarity on wrong strand
            Possible reversed clone: polyT not found
            Insert Length: 689 Std Error: 0.00
            Seq primer: -40m13 fwd. ET from Amersham
            High quality sequence stop: 451.
FEATURES             Location/Qualifiers
     source          1..549
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="1240116"
                     /tissue_type="parathyroid tumor"
                     /dev_stage="adult"
                     /lab_host="DH10B (ampicillin resistant)"
                     /clone_lib="Soares_parathyroid_tumor_NbHPA"
                     /note="Organ: parathyroid gland; Vector: pT7T3D (Pharmacia)
                     with a modified polylinker; Site_1: Not I; Site_2: Eco
                     RI; 1st strand cDNA was primed with a Not I - oligo(dT)
                     primer
                     [5'-TGTTACCAATCTGAAGTGGGAGCGGCCGCACCAATTTTTTTTTTTTTTTTTTTT
                     TTTTT-3'], double-stranded cDNA was size selected, ligated
                     to Eco RI adapters (Pharmacia), digested with Not I and
                     cloned into the Not I and Eco RI sites of a modified pT7T3
                     vector (Pharmacia). Library went through one round of
                     normalization to a Cot = 5. Library constructed by Bento
                     Soares and M. Fatima Bonaldo. RNA from sporadic parathyroid
                     adenomas was kindly provided by Dr. Stephen Marx, National
                     Institute of Diabetes and Digestive and Kidney Diseases,
                     NIH."
BASE COUNT      137 a    163 c    156 g     92 t      1 others
ORIGIN

  Query Match             45.5%;  Score 408.2;  DB 9;  Length 549;
  Best Local Similarity   91.7%;  Pred. No. 2.1e-87;
  Matches  431;  Conservative    0;  Mismatches   39;  Indels    0;  Gaps    0;



Qy        424 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 483
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         15 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 74

Qy        484 TCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 543
              ||||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||
Db         75 TCGCTGAAGTGTGAGGCAGCAGCGGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 134

Qy        544 GGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAAC 603
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        135 GGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAAC 194

Qy        604 TCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCC 663
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        195 TCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCC 254

Qy        664 GAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACC 723
              ||||||||||||||||||||||||||| | |||||||||||||||||||||||||  |||
Db        255 GAGAACATCCTGGGGAAGGACACCGTCGGAGGCCGGCTTTACGTCAACAGCGTGACGACC 314

Qy        724 ACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGC 783
              |||||||||||||||||||||||||||||||||||||||| |||||||||||||||||||
Db        315 ACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGNGACAGCCAAGTCCTATTGC 374

Qy        784 GTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGTCCT 843
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||   |||
Db        375 GTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGGCACCT 434

Qy        844 GTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTC 893
              | |    ||  |   ||     ||   |||  |       ||  | ||||
Db        435 GGGCTGCACTGCTTAGAACTTGGTACCCAGAGCCACCACTTCCCCATCTC 484



RESULT 6
AI041451
LOCUS       AI041451                 412 bp    mRNA    linear   EST 28-AUG-1998
DEFINITION  ow36c02.s1 Soares_parathyroid_tumor_NbHPA Homo sapiens cDNA clone
            IMAGE:1648898 3' similar to TR:O14511 O14511 NTAK. ;, mRNA
            sequence.
ACCESSION   AI041451
VERSION     AI041451.1  GI:3280645
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 412)
  AUTHORS   NCI-CGAP http://www.ncbi.nlm.nih.gov/ncicgap.
  TITLE     National Cancer Institute, Cancer Genome Anatomy Project (CGAP),
            Tumor Gene Index
  JOURNAL   Unpublished
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            cDNA Library Preparation: M. Bento Soares, Ph.D., M. Fatima Bonaldo,
            Ph.D.
            cDNA Library Arrayed by: Greg Lennon, Ph.D.
            DNA Sequencing by: Washington University Genome Sequencing Center
            Clone distribution: NCI-CGAP clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            www-bio.llnl.gov/bbrp/image/image.html
            Trace considered overall poor quality
            Insert Length: 671 Std Error: 0.00
            Seq primer: -40m13 fwd. ET from Amersham
            High quality sequence stop: 1.
FEATURES             Location/Qualifiers
     source          1..412
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:1648898"
                     /tissue_type="parathyroid tumor"
                     /dev_stage="adult"
                     /lab_host="DH10B (ampicillin resistant)"
                     /clone_lib="Soares_parathyroid_tumor_NbHPA"
                     /note="Organ: parathyroid gland; Vector: pT7T3D (Pharmacia)
                     with a modified polylinker; Site_1: Not I; Site_2: Eco
                     RI; 1st strand cDNA was primed with a Not I - oligo(dT)
                     primer
                     [5'-TGTTACCAATCTGAAGTGGGAGCGGCCGCACCAATTTTTTTTTTTTTTTTTTTT
                     TTTTT-3'], double-stranded cDNA was size selected, ligated
                     to Eco RI adapters (Pharmacia), digested with Not I and
                     cloned into the Not I and Eco RI sites of a modified pT7T3
                     vector (Pharmacia). Library went through one round of
                     normalization to a Cot = 5. Library constructed by Bento
                     Soares and M. Fatima Bonaldo. RNA from sporadic parathyroid
                     adenomas was kindly provided by Dr. Stephen Marx, National
                     Institute of Diabetes and Digestive and Kidney Diseases,
                     NIH."
BASE COUNT      112 a    108 c    126 g     65 t      1 others
ORIGIN

  Query Match             44.2%;  Score 396.6;  DB 9;  Length 412;
  Best Local Similarity   97.6%;  Pred. No. 1.2e-84;
  Matches  402;  Conservative    0;  Mismatches   10;  Indels    0;  Gaps    0;



Qy        426 CACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATC 485
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 CACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATC 60

Qy        486 GCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGG 545
              |||||||||||||||||||||    |||||||||||||||||||||||||||||||||||
Db         61 GCTGAAGTGTGAGGCAGCAGCGATAAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGG 120

Qy        546 CAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTC 605
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 CAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTC 180

Qy        606 ACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGA 665
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 ACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGA 240

Qy        666 GAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCAC 725
              |||||||||||||||||||||||| || |||||||||||||||||||||||||  |||||
Db        241 GAACATCCTGGGGAAGGACACCGTACGAGGCCGGCTTTACGTCAACAGCGTGACGACCAC 300

Qy        726 CCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGT 785
              ||||||||||||||||||||||||| |||||||||||| |||||||||||||||||||||
Db        301 CCTGTCATCCTGGTCGGGGCACGCCGGGAAGTGCAACGNGACAGCCAAGTCCTATTGCGT 360

Qy        786 CAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAG 837
              ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 CAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAG 412



RESULT 7 
BF108794 
LOCUS 

DEFINITION 



ACCESSION 
VERSION 
KEYWORDS 
SOURCE 

ORGANISM 



REFERENCE 
AUTHORS 
TITLE 

JOURNAL 
COMMENT 



FEATURES 

source 



BF108794 427 bp itiRNA linear EST 20-OCT-2000 

7152g03,xl Soares_NSF_F8_9W_OT_PA_P_Sl Homo sapiens cDNA clone 
IMAGE: 3525292 3' similar to SW : NTAK_HUMAN 014511 NTAK PROTEIN 
/contains MSRl.tl MSRl repetitive element mRNA sequence. 
BF108794 

BF108794.1 GI:10938484 
EST. 

Homo sapiens (human) 
Homo sapiens 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 
1 (bases 1 to 427) 

NCI-CGAP http://www.ncbi.nlm.nih.gov/ncicgap. 

National Cancer Institute, Cancer Genome Anatomy Project (CGAP) , 

Tumor Gene Index 

Unpublished 

Contact: Robert Strausberg, Ph.D. 
Email : cgapbs-r @mail . nih . gov 

This clone is available royalty-free through LLNL ; contact the 
IMAGE Consortium (info@image.llnl.gov) for further information. 
Seq primer: -4 0UP from Gibco 
High quality sequence stop: 396. 

Location/Qualif iers 

1. .427 

/organism="Homo sapiens" 
/mol_type="mRNA" 
/db_xref="taxon: 9606" 
/clone="IMAGE: 3525292" 
/lab_host="DH10B" 

/ clone_lib= " Soares_NS F__F8_9W_0T_PA_P_S 1 " 

/note="Organ: pooled; Vector: pT7T3D-Pac (Pharmacia) with
a modified polylinker; Site_1: Not I; Site_2: Eco RI;
Equal amounts of plasmid DNA from five normalized
libraries were mixed, and ss circles were made in vitro.
Following HAP purification, this DNA was used as tracer in
a subtractive hybridization reaction. The driver was
PCR-amplified cDNAs from pools of 5,000 clones made from
the same 5 libraries. The pools consisted of the following
libraries and cloneIDs: Soares NbHSF pool 1:
309384-310919, 323208-325895 Soares Nb2HP pool 1:
145032-147335, 147720-148103, 148872-149255, 15002 -
150407, 151176-152327 Soares Nb2HF8-9W pool 1:



758280-760583, 772104-774407 Soares NbHPA pool 1:
304776-306311, 320136-322823, 326280-326663 Soares NbHOT
pool 1: 723720-726407, 739080-740999 Subtraction by Bento
Soares and M. Fatima Bonaldo."

BASE COUNT 114 a 112 c 145 g 56 t 

ORIGIN 

Query Match 40.6%; Score 363.8; DB 10; Length 427;

Best Local Similarity 91.3%; Pred. No. 8.5e-77;

Matches 386; Conservative 0; Mismatches 37; Indels 0; Gaps 0; 
Qy 363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422 



LlL_i.JJ_ ' I ' I III II I I II I I 

64 



Db 5 ccgcggcaagaagcacccagaggggaggaagcgggagagggagcccgatcccggggAgaa 



Db 65 



Qy 423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482

""""" I > 1 11 1 1 1 1 1 II 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 11 1 1 1 1 1 1 1 1 11 II 

AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 124 

Qy 483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542 

nK ,,''I""II"I"IIIIII|||||||||||||MIII||||| III Mill lllilllll 

Db 125 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCciACciiTiGiicJJ.Gil 184 

Qy 543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602

nh ''""""""ll"ll'llllllllllllilllllllllllll|iri|||||llll 

Db 185 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGiAGAAA^ 244 

Qy 603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662 

nK o.. i.i""""""""""l"ll"l"NIIII||illlllllllllllllllll 

Db 245 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGiiiiTciGCGAGGi 304 

Qy 663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722

''"""""l"l"IIIIIIIM|llllllllllll||||||||||llllllllll 

Db 305 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGicAACAGCGTGAGC^^ 364 

Qy 723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782 

nn 'i I I I I I I I I I 11 I II I I I I I I 11 I I I I I I I I I I II I I I I I I 

Db 365 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGAcAGicJ^iicciiiiG 424 

Qy 783 CGT 785 

III 

Db 425 CGT 427 

RESULT 8 
BX529505 

ID BX529505 standard; RNA; EST; 488 BP 
XX ' 

AC BX529505; 
XX 

SV BX529505.1 
XX 

DT 27-MAY-2003 (Rel. 75, Created) 

DT 27-MAY-2003 (Rel. 75, Last updated, Version 1)
XX 

DE RZPD Mus musculus cDNA clone IMAGp998N017639 = IMAGE:3153984 5' EST.



KW 
XX 



EST; expressed sequence tag. 



OS Mus musculus (house mouse) 



OC 

oc 

XX 

RN [1] 



Eukaryota; Metazoa; Chordai-^ • r^^ • ^ 



RP 1-488 

RA Heil O., Ebert L., Neuberh p d ^ 

" Korn B.; P-ters M. , Radelcf U., Schneider D., 



Feld 580, D-69120 Heidelbe.g, G^rl^^J ^^encforschung G„^„ Heuenhei.er 
RZPD; IMAGp998N017639

RZPD Deutsch., p.. _ 



Location/Qualifiers 



RA 
RT 
RL 
RL 
RL 
XX 
CC 
CC 
CC 
CC 

CC _^ 

CC Tel: +49 30 32639 101

CC Fax: +49 30 32639 111
CC www.rzpd.de 
CC 
CC 
CC 
XX 
FH 
FH 

^ 1. .488 

FT /db_xref="taxon:10090"

FT /note="Cloned unidirectiona 1 1 ,r u ■ 

FT 2 kb. Library consTrli:!'/^^"'"' I ^^^^^e 

fJ ^^talog #12017-018. investlaf^ ^ ""^^^ Technologies, 

fJ «-"-ighausen/chu-Xia ol^g NiS Rf/"""'^'"^ "^"^P^^- ^-^thar 

^ model: Xu et al., Natur^^Genetics sT^'v 4^ m o^"^^^"^^ 

this is a NCI_CGAP Library ' (1999). Note: 

^ /^^^P'/'''^-"^'^i-nlm.nih.gov/ncicqaD/> 

/organism="Mus musculus" '"^i'^g^P/>. 

/clone="IMAGp998N017639" 

3Q sequence 488 BP, 12, a, ^6 c, US 0, 9. T; 0 othe.; 

Query Match o 

Best Local Similarity sill Sd" f'l''. ^""^^^^ ^88; 

1 GTGGGTGAGAAGCAGTCGCTCAAGTGiiirii! ' ' II ' IJ ' " ^ I I I | | 1, , , , 



Db 

Qy 



, I ' I I I I I I I I I ,1,1, 11,11,,, , y--';'^>^'i'CCCCAGCCTTCCTAC 528 

?'=m==^^ .8 



Db 


61 


Qy 


589 


Db 


121 


Qy 


649 


Db 


181 


Qy 


709 


Db 


241 


Qy 


769 


Db 


301 


Qy 


829 


Db 


361 



CGCTGGTTCAAGGATGGCAAGGAACTCAACCGGAGTCGTGATATTCGCATCAAGTATGGC 120 
AACGGCAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAG 648

II INIIIIIIMIMNI MIMIIIIIllli MM IIMIIIM || ||||m 

AATGGCAGAAAGAACTCACGGCTACAGTTCAACAAAGTGAGGGTGGAGGATGCCGGGGAG 180 
TATGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTC 708

" Kill IIIIMIIIIIIMIII IMMMIIIMII Mil III M I |M 

TACGTCTGTGAGGCCGAGAACATCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTC 240
AACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACA 768

IIKIIIIIIIIIIIII IIMIIIIIMMMM II IIMIIIIIIIIII Mill 

AACAGCGTGAGCACCACTCTGTCATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACC 300 

GCCAAGTCCTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTC 828

II I I I I I II II II II II I I I M I I I II II II II I I I I II II M II M I II II 
GCCAAGTCCTACTGTGTGAATGGAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTC 360 

TCCTGCAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAG 873 
I I I I I I I I II I I I I II I I I I I 1 I I II I III 



RESULT 9 

BI410828/C 

LOCUS 

DEFINITION 

ACCESSION 
VERSION 
KEYWORDS 
SOURCE 

ORGANISM 



REFERENCE 
AUTHORS 
TITLE 
JOURNAL 

COMMENT 



FEATURES 

source 



BI410828 949 bp mRNA linear EST 14-AUG-2001

602963734F1 NCI_CGAP_Lu33 Mus musculus cDNA clone IMAGE:5119065 5',

mRNA sequence.
BI410828 

BI410828.1 GI:15171751 
EST. 

Mus musculus (house mouse) 
Mus musculus 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
1 (bases 1 to 949)

NIH-MGC http://mgc.nci.nih.gov/. 

National Institutes of Health, Mammalian Gene Collection (MGC) 
Unpublished 

Contact: Robert Strausberg, Ph.D. 

Email: cgapbs-r@mail.nih.gov

Tissue Procurement; Gilbert Smith, Ph.D. 

cDNA Library Preparation: M. Bento Soares, Ph.D., M. Fatima 
Bonaldo, Ph.D. 

cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
DNA Sequencing by: Incyte Genomics, Inc.

Clone distribution: NCI-CGAP clone distribution information can be
found through the I.M.A.G.E. Consortium/LLNL at:
http://image.llnl.gov



Plate: LLAM11290 row: d column: 
High quality sequence start: 28 
High quality sequence stop: 919. 

Location/Qualifiers 

1. .949 

/organism="Mus musculus" 
/mol_type="mRNA" 
/strain="CZECH II" 



10 



/db_xref="taxon:10090"
/clone="IMAGE:5119065"
/tissue_type="pooled lung tumors"
/lab_host="DH10B (phage-resistant)"
/clone_lib="NCI_CGAP_Lu33"

/note="Organ: lung; Vector: pT7T3D-Pac (Pharmacia) with a
modified polylinker; Site_1: NotI; Site_2: EcoRI; 1st
strand cDNA was prepared from mRNA obtained from pooled
lung tumors with a Not I - oligo(dT) primer [5'

TGTTACCAATCTGAAGTGGGAGCGGCCGCCTCTGTTTTTTTTTTTTTTTTT 3']
Double-stranded cDNA was ligated to Eco RI adaptors
(Pharmacia) digested with Not I and cloned into the Not

I and Eco RI sites of the modified pT7T3 vector. Library

went through one round of normalization, and was

171 a'^^^'erc''' 269T'° fjrr ^^"'"'^ " 

ORIGIN ^ ^ 



Qy 

Db 

Qy 457 
Db 



Db 

Qy 571 



Matches 388; Conservative 0; Mismatches 85; Indels 12; Gaps 7;

8.0 cUiG^i.G.i;c^^^lilic.G4i;i;c;;iii-ii-,,,.',i'^^ 83i 
830 CCCC.CciiiciUcGGiGi.i;i-^iiiLLlii^^^^^ 

770 "TCGCATCAAGTATGC«ATGGCAG7W.GAACTa«Gra«Gii,!iiiI^^ 711 
631 =y--=™GGACGCTGGGC»GTATGTC^ 

7io 

689 «CGGGGCCGGCTTTACGT«A«^^^^ 

eso i--GA^iiii^i;Gci.iiiXiii;- -i-,-^^^^^^ 

749 ^^^^^^^^^^^l^'^eAGACAC^^^ ^ 

531 TCGAGGGCATGAAGCAGCTCTCCTGCAAATGTi<!MACGGAiTCTTciLciGiL^^^^ 472 

Qy 869 AGCAG 873 

I I I 
471 TGGAG 4 67 



Db 

Qy 

Db 

Qy 

Db 

Qy 749 



RESULT 10 

BE983573 

LOCUS 

DEFINITION 

ACCESSION 
VERSION 
KEYWORDS 
SOURCE 

ORGANISM 



REFERENCE 
AUTHORS 
TITLE 

JOURNAL 
MEDLINE 
PUBMED 
COMMENT 



FEATURES 

source 



BE983573 333 bp mRNA linear EST 29-APR-2002

UI-M-CG0p-bgi-c-07-0-UI.s1 NIH_BMAP_Ret4_S2 Mus musculus cDNA clone
UI-M-CG0p-bgi-c-07-0-UI 3', mRNA sequence.

BE983573 

BE983573.1 GI:10654893 
EST. 

Mus musculus (house mouse)
Mus musculus

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

Bonaldo,M.F., Lennon,G. and Soares,M.B. 

Normalization and subtraction: two approaches to facilitate gene
discovery

Genome Res. 6 (9), 791-806 (1996) 

97044477 

8889548 

Contact: Chin, H 

National Institute of Mental Health 

6001 Executive Blvd., Room 7N-7190, MSC 9643, Bethesda, MD
20892-9643, USA

Tel: 301 443 1706 
Fax: 301 443 9890 
Email: mEST@mail.nih.gov 

is'Skely^^ternL' iTlt' '^^^ ' ^''^ beginning of .eguence 

Soares llh r^o i- ! ^ message. cDNA Library Preparation: M.B. 
Soares Lab Clone distribution: Researchers may obtain BMAP cDNA 
clones from RESEARCH GENETICS. It should be noted that Bento Soares
is generating a small number of additional specialized
non-redundant arrays of BMAP cDNAs whose availability will be 

rZ'tTslT. foft^ri-r'^'^ collaboratiie arrangements 

The tissue for this library was contributed by Dr. Xin-Yuan Fu,
Yale University School of Medicine. The following repetitive
elements were found in this cDNA sequence: 15-105 

^GC_r i ch #Low_compl exi t y ' 
Seq primer: M13 Forward 
POLYA=No . 

Location/Qualifiers 
1. .333 

/organism="Mus musculus" 

/mol_type="mRNA" 

/strain="C57BL/6J"

/db_xref="taxon:10090"

/clone="UI-M-CG0p-bgi-c-07-0-UI"

/lab_host="DH10B (Life Technologies)"

/clone_lib="NIH_BMAP_Ret4_S2"

inf.r'T''''°l' P"'^35-Pac-(Pharmacia) with a modified 
polylinker; Site_1: Not I; Site_2: Eco RI; The
NIH_BMAP_Ret4_S2 library is a subtracted library, 
ultimately derived from mouse retina tissue libraries at 

°/ development. For a detailed description 
of the library from which this clone was derived, please 
visit our web site at brainest.eng.uiowa.edu. The tissue



for this library was contributed by Dr. Xin-Yuan Fu, Yale
University School of Medicine



TAG_SEQ=None found" 
BASE COUNT 47 a 124 c 122 g 40 t 

ORIGIN ^ ^ 



Matches 241; Conservative 0; Mismatches 11; Indels 0; Gaps 0;



Qy 


1 


Db 


82 


Qy 


61 


Db 


142 


Qy 


121 


Db 


202 


Qy 


181 


Db 


262 


Qy 


241 


Db 


322 



CTCGCCTGC 60 



1 M I M 1, , M M Ml n I m m r, I 7 m rTn n n n n n 



'''''' I ' I I N I I I I I I III II I I I I I I I II II II I I I 11 I II I 

GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTicciGCTCT^ciGcici 



:CGAGAGCCG 261 



N I I I I I M I M I M , I I I M ,1 I N 11 M M Mm 

CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGicIiiii^^ 321 



I I 



RESULT 11 

AA772412 

LOCUS 

DEFINITION 



ACCESSION 
VERSION 
KEYWORDS 
SOURCE 

ORGANISM 



REFERENCE 
AUTHORS 
TITLE 

JOURNAL 
COMMENT 



AA772412 oqt k 

m?or:/„™ne^;»L™"""' "™t™io» fact™ 

AA772412 

AA772412.1 GI:2824195 
EST. 

Homo sapiens (human) 
Homo sapiens 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
1 (bases 1 to 297)

NCI-CGAP http://www.ncbi.nlm.nih.gov/ncicgap.

National Cancer Institute, Cancer Genome Anatomy Project (CGAP),
Tumor Gene Index

Unpublished 

Contact: Robert Strausberg, Ph.D. 
Email: cgapbs-r@mail.nih.gov 

cDNA Library Preparation: M. Bento Soares, Ph.D., M. Fatima Bonaldo
cDNA Library Arrayed by: Greg Lennon, Ph.D 



^-bio . llnl . gov/fobrp/image/image . html 



FEATURES 

source 



BASE COUNT 
ORIGIN 



Pos^thl^ "^""^^ "^"^^^ similarity on wrong strand 
Possible reversed clone: polyT not found 
Insert Length: 667 Std Error: 0 00 
Seq primer: -40ml3 fwd. ET from Amersham 
High quality sequence stop: 267. 

Location/Qualifiers 
1. .297 

/organism="Homo sapiens" 
/mol_type="mRNA"
/db_xref="taxon:9606"
/clone="1359886"

/tissue_type="parathyroid tumor"
/dev_stage="adult"

/lab_host="DH10B (ampicillin resistant)"
/clone_lib="Soares_parathyroid_tumor_NbHPA"
/note="Organ: parathyroid gland; Vector: pT7T3D (Pharmacia
) with a modified polylinker; Site_1: Not I; Site_2: Eco
RI; 1st strand cDNA was primed with a Not I - oligo(dT)

TTTTT-3'], double-stranded cDNA was size selected, ligated
to Eco RI adapters (Pharmacia), digested with Not I and

87 a 68 c 105 g 37 t 



Matches 236.. Conservative O; Mismatches' 'li,. xndels 0, .,ps o, 
GATCCTC;TG«CTC«CTGCGcmcCC^ ^ 



Qy 


405 


Db 


42 


Qy 


465 


Db 


102 


Qy 


525 


Db 


162 


Qy 


585 


Db 


222 


Qy 


645 


Db 


282 



C,ACCGTTGGTTCAAGGATG^ 



I I M I I I I I I I I I I I I 



RESULT 12 

BI651936 

LOCUS 

DEFINITION 

ACCESSION 
VERSION 
KEYWORDS 
SOURCE 

ORGANISM 



REFERENCE 
AUTHORS 
TITLE 
JOURNAL 

COMMENT 



FEATURES 

source 



BASE COUNT 
ORIGIN 



BI651936 795 bp mRNA linear EST 12-SEP-200

603298677F1 NCI_CGAP_Mam3 Mus musculus cDNA clone IMAGE:5339251 5',

mRNA sequence.

BI651936 

BI651936.1 GI:15566172 
EST. 

Mus musculus (house mouse) 
Mus musculus 

Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
1 (bases 1 to 795)
NIH-MGC http://mgc.nci.nih.gov/.

National Institutes of Health, Mammalian Gene Collection (MGC)
Unpublished

Contact: Robert Strausberg, Ph.D.
Email: cgapbs-r@mail.nih.gov

IE sF"F-"°-" "-"?ecn=:i"og-,.";-,„--- 

Clone distribution: MGC clone distribution information can be
found through the I.M.A.G.E. Consortium/LLNL at:
http://image.llnl.gov

Plate: LIAM11861 row: j column: 20 
High quality sequence stop: 795. 

Location/Qualifiers 
1. .795 

/organism="Mus musculus" 
/mol_type="mRNA"

/strain="129, C57BL/6J, FVB/N"
/db_xref="taxon:10090"
/clone="IMAGE:5339251"
/tissue_type="tumor, gross tissue"
/dev_stage="10 months"
/lab_host="DH10B"
/clone_lib="NCI_CGAP_Mam3"

22, 37-43 (1999) - "'°aej. . xu et al.. Nature Genetics 

204 a 226 c 219 g 145 t



Query Match 23.4%; Score 210.2; DB 12; Length 795;
Best Local Similarity 86.2%; Pred. No. 7e-40;
Matches 244; Conservative 0; Mismatches 38; Indels 1; Gaps 1;

Qy 592 GGCAGAAAGAACTCACGAC-TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTA 650
       ||||||||||||||||| | ||||||||||||| |||| ||||||||| || ||||||||
Db   1 GGCAGAAAGAACTCACGGCTTACAGTTCAACAAAGTGAGGGTGGAGGATGCCGGGGAGTA 60



Qy 


651 


Db 


61 


Qy 


711 


Db 


121 


Qy 


771 


Db 


181 


Qy 


831 


Db 


241 



CAAGTCCTATTGCGTCAATGGAGG 



CGTCTGCTACTACATCGAGGGCATCAACCAGCTCTC 830 

lll'IIIIIIMliMMIIllMiiMiMI 

GTGCTACTACATCGAGGGCATCAACCAGCTCTC 24 0 



CTGCAAGTGTCC^ 

'1 I I I I I I I I I I I I III I I I I II I , , 

CTGCAAATGTCCAAACGGATTCTTCGGACAGAGATGTTTGGAG 283 



RESULT 13 

BE648780 

LOCUS 

DEFINITION 

ACCESSION 
VERSION 
KEYWORDS 
SOURCE 

ORGANISM 



REFERENCE 
AUTHORS 
TITLE 

JOURNAL 
MEDLINE 
PUBMED 
COMMENT 



FEATURES 

source 



BE648780 otrq , 

BE648780.1 GI:9974601 
EST. 

Mus musculus (house mouse) 
Mus musculus 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

1 (bases 1 to 259)

Bonaldo,M.F., Lennon,G. and Soares,M.B.

Normalization and subtraction: two approaches to facilitate gene
discovery

Genome Res. 6 (9), 791-806 (1996)
97044477

8889548 

Contact: Chin, H 

National Institute of Mental Health 

6001 Executive Blvd., Room 7N-7190, MSC 9643, Bethesda, MD
20892-9643, USA

Tel: 301 443 1706 

Fax: 301 443 9890 

Email: inEST@mail.nih.gov 

Ss;ariL™^L%"^^ -J — distribution: 

should be noted that Bent?% RESEARCH GENETICS. It 

additional spJciaMzed non generating a small number" of 

availability'^Sf brconsSerfr °' -hose 

collaborative arLn^em^nts appropriate and limited 

Seq primer: M13 Reverse. 

Location/Qualifiers 
1. .259 

/organism="Mus musculus" 
/mol_type="mRNA" 
/strain="C57BL/6J" 
/db_xref="taxon: 10090" 
/clone="UI-M-BH2.2-aop-b-12-0-UI"



BASE COUNT 
ORIGIN 



/dev_stage="27-32 days" 
/lab_host="DH10B (Life Technologies)"
/clone_lib="NIH_BMAP_M_S3.2"

»IH^B,«P „ S3.2 library is a subtracted llbrX of a 

S";-'-" — ^^^^^ 

gj-and, striatum, hipoccampus ) after a series of 

iTa^EsiT^^: ^r'"^^ representation o? ^DNAs fro. 

which ESTs had already been generated. The following
serially subtracted libraries were generated in this

^L^^rub^r^ctedTb^-^'-^ ^^«--^^X32, NIHbSap'S^I. 

ixie suDCracted library (NIH BMAP m ~ 

as follows: PCHa.plifLd cDNHnselt " f o/ni„^b;;JpTsT 
clones from which 3' ESTs had been derived was^u^d- Is a 

in the form of single-stranded circles. The~remai ni n^^ 
bacteria (li^l^ circles and electroporated into DHIOB 

61 a 69 c 75 g 54 t



Query Match 9i pa. o 

Best Local Similarity lo.H', TrlT ll^'lUl^ Length 259; 

Hatches 209; Conservative 0; M^sL^^che^^^'^S; Xndels 0; Caps 

LkkckilcTrLL^^^^^^^^ I 'I II I IN III Nil, 



TGTGGGATarar-r-ri^T.r.-^^:.,, '11' 'I I'' N II I I II 



0; 



Qy 


663 


Db 


1 


Qy 


723 


Db 


61 


Qy 


783 


Db 


121 


Qy 


843 


Db 


181 



RESULT 14 

AA968077 

LOCUS 

DEFINITION 



AA968077 -^o-i k 

uh09h01.rl Soares mouse h.^lf-^^ . ^^''^^'= 19-MAY-1998 



ACCESSION 
VERSION 
KEYWORDS 
SOURCE 

ORGANISM 



REFERENCE 
AUTHORS 



TITLE 
JOURNAL 
COMMENT 



FEATURES 

source 



BASE COUNT 
ORIGIN 



AA968077 

AA968077.1 GI:3141970 
EST. 

Mus musculus (house mouse) 
Mus musculus 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

1 (bases 1 to 327)

Marra,M., Hillier,L., Allen,M., Bowles,M., Dietrich,N., Dubuque,T.,
Geisel,S., Kucaba,T., Lacy,M., Le,M., Martin,J., Morris,M.,
Schellenberg,K., Steptoe,M., Tan,F., Underwood,K., Moore,B.,

JatS^^^^R: • 

The WashU-HHMI Mouse EST Project 
Unpublished 

Contact: Marra M/Mouse EST Project 

WashU-HHMI Mouse EST Project 

Washington University School of Medicine

Fax: 314 286 1810

Email: mouseest@watson.wustl.edu

This clone is available royalty-free through LLNL; contact the
IMAGE Consortium (info@image.llnl.gov) for further information.

Seq primer: -28ml3 rev2 ET from Amersham 
High quality sequence stop: 318. 

Location/Qualifiers 

1. .327 

/organism="Mus musculus" 

/mol_type="mRNA"

/db_xref="taxon:10090"

/clone="IMAGE:1617457"

/tissue_type="hypothalamus"

/lab_host="DH10B"

/clone_lib="Soares mouse hypothalamus NMHy"

/note="Organ: brain; Vector: pT7T3D-Pac (Pharmacia) with a
modified polylinker; Site_1: Not I; Site_2: Eco RI; 1st
strand cDNA was primed with a Not I - oligo(dT) primer [5'

TGTTACCAATCTGAAGTGGGAGCGGCCGCCAAGGTTTTTTTTTTTTTTTTTTTTTTTT
T 3']; double-stranded cDNA was ligated to Eco RI adaptors
(Pharmacia), digested with Not I and cloned into the Not I 
and Eco RI sites of the modified pT7T3 vector. RNA 
provided by Dr. Wolfgang Liedtke. Library went through 
two rounds of normalization, and was constructed by Bento 
Soares and M.Fatima Bonaldo." 
81 a 81 c 99 g 66 t 



Query Match 21.8%; Score 195.2; DB 9; Length 327;
Best Local Similarity 90.2%; Pred. No. 2.1e-36;
Matches 231; Conservative 0; Mismatches 23; Indels 2; Gaps 2;



Qy 

Db 



641 



CTGGGGAGTATGTC^ 700 
2 CCGGGGAGTACGTCTGTGAGGCCCAGAACATCCTTGGGAAGGACACCGTGAGGGA-CGAC 60 



Qy 


701 


Db 


61 


Qy 


7 61 


Db 


120 


Qy 


821 


Db 


180 


Qy 


881 


Db 


240 



TTTACGTCAACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGG^^^ 760 

1^ ' I I I I I I I I I I I I I I Mill I M I M I I I I I I I M M M I I I I I M I I I I I I 

TCCATGTCAACAGCGTGAG-ACCACTGTGTCATCCTGGTCGGGACATGCCCGGAAGTGC^ 119 

ACGAGACAGCCAAGTCCT^^ 820 
I I I I I I I I I I I I I I I I I M II I I M I I I I I II I I I II M I I I II II n II li M 
ATGAGACCGCCAAGTCCTACTGTGTGAATGGAGGCGTGTGCTACTACATCGAGGGC^^ 179 

ACCAGCTCTCCT 

'''''''''''> I I I I I I I I I I I I I I I II M I I I I II I I I I I I I I I I I I I I I I M II II I 

ACCAGCTCTCCTGCAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTC^ 239 



J I I I I I I I I I I I I I 



RESULT 15 

BX089049/C 

LOCUS 

DEFINITION 

ACCESSION 
VERSION 
KEYWORDS 
SOURCE 

ORGANISM 



REFERENCE 
AUTHORS 

TITLE 
JOURNAL 
COMMENT 



FEATURES 

source 



BX089049 362 bp mRNA linear EST 23-JAN-2003

BX089049 Soares_parathyroid_tumor_NbHPA Homo sapiens cDNA clone
IMAGp998M133119 ; IMAGE:1240116, mRNA sequence.
BX089049

BX089049.1 GI:27825909 
EST. 

Homo sapiens (human) 
Homo sapiens 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
1 (bases 1 to 362)

Ebert,L., Heil,O., Hennig,S., Neubert,P., Partsch,E., Peters,M.,
Radelof,U., Schneider,D. and Korn,B.
Human UnigeneSet - RZPD3 
Unpublished 
Contact: Ina Rolfs 

RZPD Deutsches Ressourcenzentrum fuer Genomforschung GmbH
Im Neuenheimer Feld 580, D-69120 Heidelberg, Germany 
RZPD; IMAGp998M133119. ^ 
RZPDLIB; I.M.A.G.E. cDNA Clone Collection; 
Human UnigeneSet - RZPD3 (RZPDLIB No. 972) 
http://www.rzpd.de/CloneCards/cgi-bin/showLib.pl.cgi/response?libNo=972

Contact: Ina Rolfs

RZPD Deutsches Ressourcenzentrum fuer Genomforschung GmbH

Heubnerweg 6, D-14059 Berlin, Germany 

Tel: +49 30 32639 101 

Fax: +49 30 32639 111 

www.rzpd.de 

This clone is available royalty-free from RZPD; 

contact RZPD (clone@rzpd.de) for further information. Seq primer: 
MlJr, Primer sequence: TTTCACACAGGAAACAGCTATGAC . 

Location/Qualifiers 

1. .362 

/organism="Homo sapiens" 
/mol_type="mRNA" 
/db_xref="taxon:9606"

/clone="IMAGp998M133119 ; IMAGE:1240116"
/tissue_type="parathyroid tumor"



BASE COUNT 
ORIGIN 



/dev_stage="adult"

/lab_host="DH10B (ampicillin resistant)"

/clone_lib="Soares_parathyroid_tumor_NbHPA"

/note="Organ: parathyroid gland; Vector: pT7T3D (Pharmacia
) with a modified polylinker; Site_1: Not I; Site_2: Eco
RI; 1st strand cDNA was primed with a Not I - oligo(dT)
primer

[5'-TGTTACCAATCTGAAGTGGGAGCGGCCGCACCAATTTTTTTTTTTTTTTTTTTT
/ double-stranded cDNA was size selected, ligated 
to Eco RI adapters (Pharmacia), digested with Not I and 
cloned into the Not I and Eco RI sites of a modified pT7T3 
vector (Pharmacia). Library went through one round of 
normalization to a Cot = 5 , Library constructed by Bento 
Soares and M. Fatima Bonaldo. RNA from sporadic parathyroid 
adenomas was kindly provided by Dr. Stephen Marx, National 
Institute of Diabetes and Digestive and Kidney Diseases, 

64 a 100 c 120 g 77 t 1 others 



Query Match 20.4%; Score 182.6; DB 13; Length 362;
Best Local Similarity 85.3%; Pred. No. 2.3e-33;
Matches 203; Conservative 0; Mismatches 35; Indels 0; Gaps 0;

Qy
Db
Qy
Db
Qy
Db
Qy
Db



656 



GCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCG 715

I I I I I I I I M I I I I I I I I I I I I I I M I I I I M I I I I I I I II II II II Ml II I Mil II 

3 62 GCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCCGGGNCCGGCTTTACGTCAACAG 



^GCG 303 



716 



TGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGT 775 
I ' I I I I I I I I I II M I II II II II M II I II I II II II M M II II I M II I I I I I II II 
302 TGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAG 



^GT 243 



776 



CCTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCA 835 
I I I I II II I II I II I II II I II I II I I II II II I II II I I II II I I I II II I I II I II I I 
242 CCTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCA 183 



836 AGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTC 

_ II INI III II II Mil II I MM 

182 AGGCACATGGGCTGCACTGCTTAGAACTTGGTACCCAGAGCCACCACTTCCC 



893 



:CCATCTC 125 



Search completed: January 14, 2004, 11:47:09 
Job time : 2320.58 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 

Run on: January 14, 2004, 07:12:21 ; Search time 3588 Seconds 

(without alignments) 

10227.407 Million cell updates/sec 

Title: US-09-864-675-3
Perfect score: 897

Sequence: 1 atgaggcgcgacccggcccc..........caatggtcaacttctcctaa 897

Scoring table: IDENTITY_NUC

Gapop 10.0 , Gapext 1.0

Searched: 2888711 seqs, 20454813386 residues 

Total number of hits satisfying chosen parameters: 5777422 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 

Database : GenEmbl:*

1: gb_ba:*
2: gb_htg:*
3: gb_in:*
4: gb_om:*
5: gb_ov:*
6: gb_pat:*
7: gb_ph:*
8: gb_pl:*
9: gb_pr:*
10: gb_ro:*
11: gb_sts:*
12: gb_sy:*
13: gb_un:*
14: gb_vi:*
15: em_ba:*
16: em_fun:*
17: em_hum:*
18: em_in:*
19: em_mu:*
20: em_om:*
21: em_or:*
22: em_ov:*
23: em_pat:*
24: em_ph:*
25: em_pl:*
26: em_ro:*
27: em_sts:*



28: em_un:*
29: em_vi:*
30: em_htg_hum:*
31: em_htg_inv:*
32: em_htg_other:*
33: em_htg_mus:*
34: em_htg_pln:*
35: em_htg_rod:*
36: em_htg_mam:*
37: em_htg_vrt:*
38: em_sy:*
39: em_htgo_hum:*
40: em_htgo_mus:*
41: em_htgo_other:*



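Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.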



SUMMARIES 



Result            Query
   No.   Score    Match %   Length   DB   ID          Description

     1   849      94.6        3020    9   AB005060    AB005060 Homo sapi
     2   835.4    93.1        1884    6   AR098145    AR098145 Sequence
     3   835.4    93.1        1884    6   AR116617    AR116617 Sequence
     4   784      87.4        3441    6   AR072052    AR072052 Sequence
     5   738      82.3         993    6   AR072053    AR072053 Sequence
     6   737      82.2        2947   10   D89995      D89995 Rattus sp.
     7   737      82.2        3076    6   E16456      E16456 Rat mRNA fo
     8   737      82.2      '3 m T   10   D89996      D89996 Rattus sp.
     9   427.8    47.7        1476    6   AR098146    AR098146 Sequence
    10   427.8    47.7        1476    6   AR116618    AR116618 Sequence
    11   427.8    47.7        2268    6   AR098155    AR098155 Sequence
    12   427.8    47.7        2268    6   AR116627    AR116627 Sequence
    13   425      47.4        2188   10   AB001576    AB001576 Rattus sp
    14   424.8    47.4      118504    9   AC094080    AC094080 Homo sapi
 c  15   424.8    47.4      152838    2   AC011589    AC011589 Homo sapi
    16   424.8    47.4      170797    9   AC011379    AC011379 Homo sapi
    17   424.8    47.4      210675    2   AC026272    AC026272 Homo sapi
    18   424      47.3        1054    6   AX406616    AX406616 Sequence
    19   424      47.3        1054    9   HS2NRG01    AF119151 Homo sapi
    20   405.4    45.2        1607    6   AR098144    AR098144 Sequence
    21   405.4    45.2        1607    6   AR116616    AR116616 Sequence
    22   405.4    45.2        2467    6   AR098143    AR098143 Sequence
    23   405.4    45.2        2467    6   AR116615    AR116615 Sequence
    24   387.2    43.2      140307    2   AC131191    AC131191 Mus muscu
    25   384      42.8      253462    2   AC096477    AC096477 Rattus no
    26   216.2    24.1        1207    6   AR072054    AR072054 Sequence
    27   173      19.3         419    6   AX406617    AX406617 Sequence
    28   173      19.3         419    9   HS2NRG02    AF119152 Homo sapi
    29   173      19.3      120236    9   AC008523    AC008523 Homo sapi
 c  30   173      19.3      189050    9   AC008667    AC008667 Homo sapi
    31   142.6    15.9       85703    2   AC020830    AC020830 Mus muscu
 c  32   142.6    15.9      191101    2   AC127350    AC127350 Mus muscu
    33   139.4    15.5      226038    2   AC106592    AC106592 Rattus no
    34   139.4    15.5      273080    2   AC098540    AC098540 Rattus no
    35   139.4    15.5      302176    2   AC096479    AC096479 Rattus no
    36   130.2    14.5         163   10   AY227026    AY227026 Mus muscu
    37   124.6    13.9         493    6   AX406618    AX406618 Sequence
    38   124.6    13.9         493    9   HS2NRG03    AF119153 Homo sapi
    39   124      13.8         350    6   AX406619    AX406619 Sequence
    40   124      13.8         350    9   HS2NRG04    AF119154 Homo sapi
    41   109.6    12.2       85703    2   AC020830    AC020830 Mus muscu
    42   108.4    12.1      206683    2   BX323592    BX323592 Danio rer
    43   108.4    12.1      220700    2   BX005008    BX005008 Danio rer
    44   97       10.8         172   10   D89997      D89997 Rattus sp.
    45   84.8      9.5         240   10   AY227025    AY227025 Mus muscu



ALIGNMENTS 



RESULT 1 
AB005060 
LOCUS 

DEFINITION 

ACCESSION 

VERSION 

KEYWORDS 

SOURCE 

ORGANISM 



REFERENCE 
AUTHORS 



TITLE 

JOURNAL 
MEDLINE 
PUBMED 
REFERENCE 
AUTHORS 
TITLE 
JOURNAL 



FEATURES 

source 



PRI 14-NOV-1997 



CDS 



AB005060 3020 bp mRNA linear

Homo sapiens mRNA for NTAK, complete cds.
AB005060 

AB005060.1 GI:2626738 
NTAK. 

Homo sapiens (human) 
Homo sapiens 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

1 (sites)

Higashiyama,S., Horikawa,M., Yamada,K., Ichino,N., Nakano,N.,
Nakagawa,T., Miyagawa,J., Matsushita,N., Nagatsu,T., Taniguchi,N.
and Ishiguro,H.

A novel brain-derived member of the epidermal growth factor family 

that interacts with ErbB3 and ErbB4 

J. Biochem. 122 (3), 675-680 (1997) 

98006324 

9348101 

2 (bases 1 to 3020) 
Ishiguro,H.

Direct Submission 

Submitted (24-JUN-1997) Hiroshi Ishiguro, Fujita Health University, 
ICMS; 1-98, kutsukake-cho, Toyoake, Aichi 470-11, Japan 
(E-mail:hishi@fujita-hu.ac.jp, Tel:0562-93-9393, Fax:0562-93-8831)

Location/Qualifiers

1. .3020

/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="taxon:9606"
/cell_line="SK-N-SH" 

/cell_type="neuroblastoma" 
226. .2778 

/codon_start=1
/product="NTAK"
/protein_id="BAA23417.1"
/db_xref="GI:2626739"

/translation="MRQVCCSALPPPPLEKGRCSSYSDSSSSSSERSSSSSSSSSESG 
SSSRSSSNNSSISRPAAPPEPRPQQQPQPRSPAARRAAARSRAAAAGGMRRDPAPGFS 



MLLFGVSLACYSPSLKSVQDQAYKAPVWEGKVQGLVPAGGSSSNSTREPPASGRVAL 
VKVLDKWPLRSGGLQREQVISVGSCVPLERNQRYIFFLEPTEQPLVFKTAFAPLDTNG 
KNLKKEVGKILCTDCATRPKLKKMKSQTGQVGEKQSLKCEAAAGNPQPSYRWFKDGKE 
LNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILGKDTVRGRLYVNSVSTT 
LSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGFFGQRCLEKLPLRLYMP 
DPKQKAEELYQKRVLTITGICVALLWGIVCWAYCKTKKQRKQMHNHLRQNMCPAHQ 
NRSLANGPSHPRLDPEEIQMADYISKNVPATDHVIRRETETTFSGSHSCSPSHHCSTA 
TPTSSHRHESHTWSLERSESLTSDSQSGIMLSSVGTSKCNSPACVEARARRAAAYNLE 
ERRRATAPPYHDSVDSLRDSPHSERYVSALTTPARLSPVDFHYSLATQVPTFEITSPN 
SAHAVSLPPAAPISYRLAEQQPLLRHPAPPGPGPGPGPGPGPGADMQRSYDSYYYPAA 
GPGPRRGTCALGGSLGSLPASPFRIPEDDEYETTQECAPPPPPRPRARGASRRTSAGP 
RRWRRSRLNGLAAQRARAARDSLSLSSGSGGGSASASDDDADDADGALAAESTPFLGL 
polyA site ^^^^^^^^SPPLCPAADSRTYYSLDSHSTRASSRHSRGPPPRAKQDSAPL" 

/note="39 A nucleotides" 
BASE COUNT 615 a 1015 c 937 g 453 t 

ORIGIN 

Query Match 94.6%; Score 849; DB 9; Length 3020; 

Best Local Similarity 98.3%; Pred. No. 7.3e-152;

Matches 858; Conservative 0; Mismatches 15; Indels 0; Gaps 0; 



Qy 


1 


ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTC;rTrTTrrrTrTr-rpr-r-r*rpr-r^r-^mr^r. 

v^^vj>^v^A iv^x 1 kjk^ i. v_. 1 1 ^^laiji (jI (ji CGCTCGCCTGC 

"< 1 1 > 1 1 > 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 , 1 1 ,,, 1 ,,,,, 1 1 ,,,,,,,, , 

ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 


60 


Db 


502 


561 


Qy         61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        562 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 621

Qy        121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        622 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 681

Qy        181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        682 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 741

Qy        241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        742 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 801

Qy        301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        802 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 861

Qy        361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        862 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 921

Qy        421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        922 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 981

Qy        481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        982 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 1041

Qy        541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1042 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 1101

Qy        601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1102 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 1161

Qy        661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1162 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 1221

Qy        721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1222 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 1281

Qy        781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGT 840
              |||||||||||||||||||||||||||||||||||||||||||||||||||||||| |||
Db       1282 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 1341

Qy        841
Db       1342


RESULT 2
AR098145
LOCUS       AR098145     1884 bp    DNA     linear   PAT 14-FEB-2001
DEFINITION  Sequence 5 from patent US 6074841.
ACCESSION   AR098145
VERSION     AR098145.1  GI:12807402
KEYWORDS
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1  (bases 1 to 1884)
  AUTHORS   Gearing,D.P. and Busfield,S.J.
  TITLE     Don-1 gene and polypeptides and uses therefor
  JOURNAL   Patent: US 6074841-A 5 13-JUN-2000;
FEATURES             Location/Qualifiers
     source          1..1884
                     /organism="unknown"
BASE COUNT      426 a    607 c    560 g    291 t
ORIGIN

Query Match 93.1%; Score 835.4; DB 6; Length 1884;
Best Local Similarity 98.1%; Pred. No. 2.9e-149;
Matches 856; Conservative 0; Mismatches 16; Indels 1; Gaps 1;

Qy          1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        218 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 277

Qy         61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        278 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 337

Qy        121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        338 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 397

Qy        181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        398 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 457

Qy        241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        458 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 517

Qy        301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        518 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 577

Qy        361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
              ||||| |||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db        578 CCCCT-GATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGGC 636

Qy        421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        637 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 696

Qy        481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        697 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 756

Qy        541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        757 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 816

Qy        601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        817 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 876

Qy        661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        877 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 936

Qy        721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        937 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 996

Qy        781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGT 840
              |||||||||||||||||||||||||||||||||||||||||||||||||||||||| |||
Db        997 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 1056

Qy        841 CCTGTGGGATACACCGGGGACAGGTGTCAGCAG 873
              ||    |||| |  |||  | || |||  | ||
Db       1057 CCAAATGGATTCTTCGGACAGAGATGTTTGGAG 1089



RESULT 3
AR116617
LOCUS       AR116617     1884 bp    DNA     linear   PAT 16-MAY-2001
DEFINITION  Sequence 5 from patent US 6133423.
ACCESSION   AR116617
VERSION     AR116617.1  GI:14096939
KEYWORDS
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1  (bases 1 to 1884)
  AUTHORS   Gearing,D.P. and Busfield,S.J.
  TITLE     Don-1 gene and polypeptides and uses therefor
  JOURNAL   Patent: US 6133423-A 5 17-OCT-2000;
FEATURES             Location/Qualifiers
     source          1..1884
                     /organism="unknown"
BASE COUNT      426 a    607 c    560 g    291 t
ORIGIN

Query Match 93.1%; Score 835.4; DB 6; Length 1884;
Best Local Similarity 98.1%; Pred. No. 2.9e-149;
Matches 856; Conservative 0; Mismatches 16; Indels 1; Gaps 1;

Qy          1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        218 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 277

Qy         61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        278 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 337

Qy        121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        338 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 397

Qy        181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        398 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 457

Qy        241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        458 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 517

Qy        301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        518 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 577

Qy        361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
              ||||| |||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db        578 CCCCT-GATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGGC 636

Qy        421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        637 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 696

Qy        481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        697 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 756

Qy        541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        757 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 816

Qy        601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        817 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 876

Qy        661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        877 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 936

Qy        721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        937 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 996

Qy        781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGT 840
              |||||||||||||||||||||||||||||||||||||||||||||||||||||||| |||
Db        997 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 1056

Qy        841 CCTGTGGGATACACCGGGGACAGGTGTCAGCAG 873
              ||    |||| |  |||  | || |||  | ||
Db       1057 CCAAATGGATTCTTCGGACAGAGATGTTTGGAG 1089
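
In the records of this listing, the "Best Local Similarity" figure agrees with the number of matches divided by the number of aligned columns (matches plus conservative substitutions, mismatches and indels). The sketch below reproduces that arithmetic from a "Matches ...;" summary line. It is only an observation that fits the values printed here, not the search program's documented definition, and the helper names `parse_counts` and `local_similarity` are hypothetical.

```python
import re

def parse_counts(line):
    """Pull the integer fields out of a 'Matches ...; Mismatches ...;' summary line."""
    pairs = re.findall(r"(Matches|Conservative|Mismatches|Indels|Gaps)\s+(\d+)", line)
    return {name: int(value) for name, value in pairs}

def local_similarity(counts):
    """Matches as a percentage of aligned columns (assumed formula, see lead-in)."""
    aligned = (counts["Matches"] + counts["Conservative"]
               + counts["Mismatches"] + counts["Indels"])
    return 100.0 * counts["Matches"] / aligned

line = "Matches 856; Conservative 0; Mismatches 16; Indels 1; Gaps 1;"
counts = parse_counts(line)
print(round(local_similarity(counts), 1))
```

For the summary line used above, the aligned length is 856 + 0 + 16 + 1 = 873 columns, which is also the length of the query region shown in the alignment (positions 1 to 873).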



RESULT 4
AR072052
LOCUS       AR072052     3441 bp    DNA     linear   PAT 18-FEB-2000
DEFINITION  Sequence 1 from patent US 5912326.
ACCESSION   AR072052
VERSION     AR072052.1  GI:7222940
KEYWORDS
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1  (bases 1 to 3441)
  AUTHORS   Chang,H.
  TITLE     Cerebellum-derived growth factors
  JOURNAL   Patent: US 5912326-A 1 15-JUN-1999;
FEATURES             Location/Qualifiers
     source          1..3441
                     /organism="unknown"
BASE COUNT      777 a   1057 c   1015 g    592 t
ORIGIN

Query Match 87.4%; Score 784; DB 6; Length 3441;
Best Local Similarity 92.2%; Pred. No. 1.8e-139;
Matches 826; Conservative 0; Mismatches 70; Indels 0; Gaps 0;



Qy          1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
              ||||||||||||||||||||||||||||| |||||||||||||||||||| |||||||||
Db        180 ATGAGGCGCGACCCGGCCCCCGGCTTCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 239

Qy         61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
              |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db        240 TACTCGCCCAGCCTCAAGTCCGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 299

Qy        121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
              |||||||||||||| |||| ||| || ||||| |||||||| |||||||||||||||||
Db        300 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCT 359

Qy        181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
              ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db        360 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 419

Qy        241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
              |||||||||||||||||||||||||||||||||||||| | |||||||||||||||||||
Db        420 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGCGCGCCGCTCGAAAGGAACCAG 479

Qy        301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
              |||||||||||||||||||||||||| || ||||||||||| |||||||| |||||||||
Db        480 CGCTACATCTTTTTCCTGGAGCCCACCGAGCAGCCCTTAGTTTTTAAGACAGCCTTTGCC 539

Qy        361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
              ||  ||||  | |||||||||||  |||||||||||||||||||||||||||||||||||
Db        540 CCGGTCGACCCTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 599

Qy        421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
              ||||| |||||||||||| |||||||||||||||| ||||| ||| ||||||| ||||||
Db        600 TGCGCAACCCGGCCCAAGCTGAAGAAGATGAAGAGTCAGACAGGAGAGGTGGGCGAGAAG 659

Qy        481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
              || ||||| ||||||||||| || || || || |||||||| ||||| || |||||||||
Db        660 CAGTCGCTCAAGTGTGAGGCGGCGGCGGGGAACCCCCAGCCCTCCTATCGATGGTTCAAG 719

Qy        541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
              || ||||||||||||||||| || || |||||||||||||| ||||||||||||||||||
Db        720 GACGGCAAGGAGCTCAACCGGAGTCGTGACATTCGCATCAAGTATGGCAACGGCAGAAAG 779

Qy        601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
              |||||||| |||||||||||||| |||||||||||||||||||| ||||| ||||| |||
Db        780 AACTCACGGCTACAGTTCAACAAAGTGAAGGTGGAGGACGCTGGAGAGTACGTCTGTGAG 839

Qy        661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
              || ||||||||||| ||||||||||| ||  ||||||||||  | |||||||| ||||||
Db        840 GCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGTGAGC 899

Qy        721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
              ||||| ||||| ||||||||||||||||||||||||||||| ||||||||||||| |||
Db        900 ACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTGCTAC 959

Qy        781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGT 840
              || || ||||||||||| |||||||||||||| ||||||||||| |||||||||||||||
Db        960 TGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAGTGT 1019

Qy        841 CCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTCCTA 896
              |||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db       1020 CCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTCCAA 1075





RESULT 5
AR072053
LOCUS       AR072053      993 bp    DNA     linear   PAT 18-FEB-2000
DEFINITION  Sequence 3 from patent US 5912326.
ACCESSION   AR072053
VERSION     AR072053.1  GI:7222941
KEYWORDS
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1  (bases 1 to 993)
  AUTHORS   Chang,H.
  TITLE     Cerebellum-derived growth factors
  JOURNAL   Patent: US 5912326-A 3 15-JUN-1999;
FEATURES             Location/Qualifiers
     source          1..993
                     /organism="unknown"
BASE COUNT      230 a    271 c    311 g    181 t
ORIGIN

Query Match 82.3%; Score 738.6; DB 6; Length 993;
Best Local Similarity 90.4%; Pred. No. 8.6e-131;
Matches 789; Conservative 0; Mismatches 84; Indels 0; Gaps 0;


Qy          1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
Db          1 ^± i CiUCaAiGCi GCTCTTCGGTGTGTCACTCGCCTGC 60

Qy         61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
Db         61 j-jA^^j. v^vjv^v^i^^vj^v^ J. ^/A/\tj± (^(^(ji (jL.A(jQjAUUA(jGUGTACAAGGCACCCGTGGTGGTGGAG 120

Qy        121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
              |||||||||||||| |||| ||| || ||||| |||||||| ||||| |||||||||||
Db        121 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGGACCCGAGAGCCT 180

Qy        181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
              ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db        181 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240

Qy        241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
              |||||||||||||||||||||||||||||||||||||| | |||||||||||||||||||
Db        241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGCGCGCCGCTCGAAAGGAACCAG 300

Qy        301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
              |||||||||||||||||||||||||| || ||||||||||| |||||||| |||||||||
Db        301 CGCTACATCTTTTTCCTGGAGCCCACCGAGCAGCCCTTAGTTTTTAAGACAGCCTTTGCC 360

Qy        361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
              ||  ||||  | |||||||||||  |||||||||||||||||||||||||||||||||||
Db        361 CCGGTCGACCCTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420

Qy        421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
              ||||| |||||||||||| |||||||||||||||| ||||| ||| ||||||| ||||||
Db        421 TGCGCAACCCGGCCCAAGCTGAAGAAGATGAAGAGTCAGACAGGAGAGGTGGGCGAGAAG 480

Qy        481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
              || ||||| ||||||||||| || || || || |||||||| ||||| || |||||||||
Db        481 CAGTCGCTCAAGTGTGAGGCGGCGGCGGGGAACCCCCAGCCCTCCTATCGATGGTTCAAG 540

Qy        541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
              || ||||||||||||||||| || || |||||||||||||| ||||||||||||||||||
Db        541 GACGGCAAGGAGCTCAACCGGAGTCGTGACATTCGCATCAAGTATGGCAACGGCAGAAAG 600

Qy        601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
              |||||||| |||||||||||||| |||||||||||||||||||| ||||| ||||| |||
Db        601 AACTCACGGCTACAGTTCAACAAAGTGAAGGTGGAGGACGCTGGAGAGTACGTCTGTGAG 660

Qy        661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
              || ||||||||||| ||||||||||| ||  ||||||||||  | |||||||| ||||||
Db        661 GCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGTGAGC 720

Qy        721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
              ||||| ||||| ||||||||||||||||||||||||||||| |||||||||||||||||
Db        721 ACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTCCTAC 780

Qy        781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGT 840
              || || ||||||||||| |||||||||||||| ||||||||||| ||||||||||| |||
Db        781 TGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAATGT 840

Qy        841 CCTGTGGGATACACCGGGGACAGGTGTCAGCAG 873
              ||    |||| |  |||  | || |||  | ||
Db        841 CCAAACGGATTCTTCGGACAGAGATGTTTGGAG 873



RESULT 6
D89995
LOCUS       D89995       2947 bp    mRNA    linear   ROD 07-FEB-1999
DEFINITION  Rattus sp. mRNA for NTAK alpha1, complete cds.
ACCESSION   D89995
VERSION     D89995.1  GI:2605629
KEYWORDS    NTAK alpha1.
SOURCE      Rattus sp.
  ORGANISM  Rattus sp.
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae;
            Rattus.
REFERENCE   1  (sites)
  AUTHORS   Higashiyama,S., Horikawa,M., Yamada,K., Ichino,N., Nakano,N.,
            Nakagawa,T., Miyagawa,J., Matsushita,N., Nagatsu,T., Taniguchi,N.
            and Ishiguro,H.
  TITLE     A novel brain-derived member of the epidermal growth factor family
            that interacts with ErbB3 and ErbB4
  JOURNAL   J. Biochem. 122 (3), 675-680 (1997)
  MEDLINE   98006324
   PUBMED   9348101
REFERENCE   2  (bases 1 to 2947)
  AUTHORS   Ishiguro,H.
  TITLE     Direct Submission
  JOURNAL   Submitted (21-DEC-1996) Hiroshi Ishiguro, Fujita Health University,
            ICMS; 1-98, kutsukake-cho, Toyoake, Aichi 470-11, Japan
            (E-mail:hishi@fujita-hu.ac.jp, Tel:0562-93-9393, Fax:0562-93-8831)
COMMENT     Sequence updated (28-Feb-1997) by:Hiroshi Ishiguro.
FEATURES             Location/Qualifiers
     source          1..2947
                     /organism="Rattus sp."
                     /mol_type="mRNA"
                     /db_xref="taxon:10118"
                     /cell_line="PC12"
                     /cell_type="pheochromocytoma"
     CDS             79..2685
                     /codon_start=1
                     /product="NTAK alpha1"
                     /protein_id="BAA23344.1"
                     /db_xref="GI:2605630"
                     /translation="MRQVCCSALPPPLEKARCSSYSYSDSSSSSSSNNSSSSTSSRSS
                     SRSSSRSSRGSTTTTSSSENSGSNSGSIFRPAAPPEPRPQPQPQPRSPAARRAAARSR
                     AAAAGGMRRDPAPGSSMLLFGVSLACYSPSLKSVQDQAYKAPVVVEGKVQGLAPAGGS
                     SSNSTREPPASGRVALVKVLDKWPLRSGGLQREQVISVGSCAPLERNQRYIFFLEPTE
                     QPLVFKTAFAPVDPNGKNIKKEVGKILCTDCATRPKLKKMKSQTGEVGEKQSLKCEAA
                     AGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILG
                     KDTVRGRLHVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGF
                     FGQRCLEKLPLRLYMPDPKQKHLGFELKEAEELYQKRVLTITGICVALLVVGIVCVVA
                     YCKTKKQRRQMHHHLRQNMCPAHQNRSLANGPSHPRLDPEEIQMADYISKNVPATDHV
                     IRREAETTFSGSHSCSPSHHCSTATPTSSHRHESHTWSLERSESLTSDSQSGIMLSSV
                     GTSKCNSPACVEARARRAAAYSQEERRRAAMPPYHDSIDSLRDSPHSERYVSALTTPA
                     RLSPVDFHYSLATQVPTFEITSPNSAHAVSLPPAAPISYRLAEQQPLLRHPAPPGPGP
                     GPGADMQRSYDSYYYPAAGPGPRRGACALGGSLGSLPASPFRIPEDDEYETTQECAPP
                     PPPRPRTRGASRRTSAGPRRWRRSRLNGLAAQRARAARDSLSLSSGSGCGSASASDDD
                     ADDADGALAAESTPFLGLRAAHDALRSDSPPLCPAADSRTYYSLDSHSTRASSRHSRG
                     PPTRAKQDSGPL"
BASE COUNT      665 a    945 c    895 g    442 t
ORIGIN

Query Match 82.2%; Score 737; DB 10; Length 2947;
Best Local Similarity 90.3%; Pred. No. 1.6e-130;
Matches 788; Conservative 0; Mismatches 85; Indels 0; Gaps 0;



Qy          1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
              ||||||||||||||||||||||||| ||| |||||||||||||||||||| |||||||||
Db        403 ATGAGGCGCGACCCGGCCCCCGGCTCCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 462

Qy         61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
              |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db        463 TACTCGCCCAGCCTCAAGTCCGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 522

Qy        121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
              |||||||||||||| |||| ||| || ||||| |||||||| |||||||||||||||||
Db        523 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCT 582

Qy        181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
              ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db        583 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 642

Qy        241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
              |||||||||||||||||||||||||||||||||||||| | |||||||||||||||||||
Db        643 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGCGCGCCGCTCGAAAGGAACCAG 702

Qy        301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
              |||||||||||||||||||||||||| || ||||||||||| |||||||| |||||||||
Db        703 CGCTACATCTTTTTCCTGGAGCCCACCGAGCAGCCCTTAGTTTTTAAGACAGCCTTTGCC 762

Qy        361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
              ||  ||||  | |||||||||||  |||||||||||||||||||||||||||||||||||
Db        763 CCGGTCGACCCTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 822

Qy        421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
              ||||| |||||||||||| |||||||||||||||| ||||| ||| ||||||| ||||||
Db        823 TGCGCAACCCGGCCCAAGCTGAAGAAGATGAAGAGTCAGACAGGAGAGGTGGGCGAGAAG 882

Qy        481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
              || ||||| ||||||||||| || || || || |||||||| ||||| || |||||||||
Db        883 CAGTCGCTCAAGTGTGAGGCGGCGGCGGGGAACCCCCAGCCCTCCTATCGATGGTTCAAG 942

Qy        541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
              || ||||||||||||||||| || || |||||||||||||| ||||||||||||||||||
Db        943 GACGGCAAGGAGCTCAACCGGAGTCGTGACATTCGCATCAAGTATGGCAACGGCAGAAAG 1002

Qy        601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
              |||||||| |||||||||||||| |||||||||||||||||||| ||||| ||||| |||
Db       1003 AACTCACGGCTACAGTTCAACAAAGTGAAGGTGGAGGACGCTGGAGAGTACGTCTGTGAG 1062

Qy        661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
              || ||||||||||| ||||||||||| ||  ||||||||||  | |||||||| ||||||
Db       1063 GCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGTGAGC 1122

Qy        721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
              ||||| ||||| ||||||||||||||||||||||||||||| |||||||||||||||||
Db       1123 ACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTCCTAC 1182

Qy        781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGT 840
              || || ||||||||||| |||||||||||||| ||||||||||| ||||||||||| |||
Db       1183 TGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAATGT 1242

Qy        841 CCTGTGGGATACACCGGGGACAGGTGTCAGCAG 873
              ||    |||| |  |||  | || |||  | ||
Db       1243 CCAAACGGATTCTTCGGACAGAGATGTTTGGAG 1275



RESULT 7
E16456
LOCUS       E16456       3076 bp    DNA     linear   PAT 28-JUL-1999
DEFINITION  Rat mRNA for neuregulin-like Transmembrane Activator for ErbB
            Kinases (NTAK).
ACCESSION   E16456
VERSION     E16456.1  GI:5711139
KEYWORDS    JP 1998179166-A/1.
SOURCE      Rattus sp.
  ORGANISM  Rattus sp.
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae;
            Rattus.
REFERENCE   1  (bases 1 to 3076)
  AUTHORS   Higashiyama,S., Taniguchi,N., Ishiguro,K. and Nagatsu,T.
  TITLE     GENE ENCODING RECEPTOR TYPE TYROSINE-KINASE ERB B LIGAND AND ITS
  JOURNAL   Patent: JP 1998179166-A 1 07-JUL-1998;
            HIGASHIYAMA SHIGEKI
COMMENT     OS   Rattus sp. (rat)
            PN   JP 1998179166-A/1
            PD   07-JUL-1998
            PF   25-DEC-1996 JP 1996356998
            PI   HIGASHIYAMA SHIGEKI, TANIGUCHI NAOYUKI, ISHIGURO KEIJI,
            PI   NAGATSU TOSHIHARU
            PC   C12N15/09,C07K14/705,C07K16/28,C12N5/10,C12N15/02,C12P21/02,
            PC   C12P21/08,
            PC   C12Q1/68,G01N33/53,G01N33/566//A61K31/70,A61K38/46,A61K39/395,
            PC   A61K48/00,
            PC   C07H21/04,(C12N5/10,C12R1:91),(C12P21/02,C12R1:91);
            CC   strandedness: Double;
            CC   topology: Linear;
            FH   Key             Location/Qualifiers
            FH
            FT   source          1..3076
            FT                   /organism='Rattus sp.'
            FT                   /cell_line='PC12'
            FT   CDS             232..2814
            FT                   /product='NTAK protein'
FEATURES             Location/Qualifiers
     source          1..3076
                     /organism="Rattus sp."
                     /mol_type="genomic DNA"
                     /db_xref="taxon:10118"
BASE COUNT      673 a    996 c    944 g    463 t
ORIGIN

Query Match 82.2%; Score 737; DB 6; Length 3076;
Best Local Similarity 90.3%; Pred. No. 1.6e-130;
Matches 788; Conservative 0; Mismatches 85; Indels 0; Gaps 0;



Qy          1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
              ||||||||||||||||||||||||| ||| |||||||||||||||||||| |||||||||
Db        556 ATGAGGCGCGACCCGGCCCCCGGCTCCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 615

Qy         61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
              |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db        616 TACTCGCCCAGCCTCAAGTCCGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 675

Qy        121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
              |||||||||||||| |||| ||| || ||||| |||||||| |||||||||||||||||
Db        676 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCT 735

Qy        181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
              ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db        736 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 795

Qy        241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
              |||||||||||||||||||||||||||||||||||||| | |||||||||||||||||||
Db        796 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGCGCGCCGCTCGAAAGGAACCAG 855

Qy        301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
              |||||||||||||||||||||||||| || ||||||||||| |||||||| |||||||||
Db        856 CGCTACATCTTTTTCCTGGAGCCCACCGAGCAGCCCTTAGTTTTTAAGACAGCCTTTGCC 915

Qy        361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
              ||  ||||  | |||||||||||  |||||||||||||||||||||||||||||||||||
Db        916 CCGGTCGACCCTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 975

Qy        421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
              ||||| |||||||||||| |||||||||||||||| ||||| ||| ||||||| ||||||
Db        976 TGCGCAACCCGGCCCAAGCTGAAGAAGATGAAGAGTCAGACAGGAGAGGTGGGCGAGAAG 1035

Qy        481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
              || ||||| ||||||||||| || || || || |||||||| ||||| || |||||||||
Db       1036 CAGTCGCTCAAGTGTGAGGCGGCGGCGGGGAACCCCCAGCCCTCCTATCGATGGTTCAAG 1095

Qy        541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
              || ||||||||||||||||| || || |||||||||||||| ||||||||||||||||||
Db       1096 GACGGCAAGGAGCTCAACCGGAGTCGTGACATTCGCATCAAGTATGGCAACGGCAGAAAG 1155

Qy        601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
              |||||||| |||||||||||||| |||||||||||||||||||| ||||| ||||| |||
Db       1156 AACTCACGGCTACAGTTCAACAAAGTGAAGGTGGAGGACGCTGGAGAGTACGTCTGTGAG 1215

Qy        661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
              || ||||||||||| ||||||||||| ||  ||||||||||  | |||||||| ||||||
Db       1216 GCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGTGAGC 1275

Qy        721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
              ||||| ||||| ||||||||||||||||||||||||||||| |||||||||||||||||
Db       1276 ACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTCCTAC 1335

Qy        781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGT 840
              || || ||||||||||| |||||||||||||| ||||||||||| ||||||||||| |||
Db       1336 TGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAATGT 1395

Qy        841 CCTGTGGGATACACCGGGGACAGGTGTCAGCAG 873
              ||    |||| |  |||  | || |||  | ||
Db       1396 CCAAACGGATTCTTCGGACAGAGATGTTTGGAG 1428



RESULT 8 

D89996
LOCUS       D89996                  3077 bp    mRNA    linear   ROD 07-FEB-1999
DEFINITION  Rattus sp. mRNA for NTAK alpha2, complete cds.
ACCESSION   D89996
VERSION     D89996.1  GI:2605631
KEYWORDS    NTAK alpha2.
SOURCE      Rattus sp.
  ORGANISM  Rattus sp.
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae;
            Rattus.
REFERENCE   1  (sites)
  AUTHORS   Higashiyama,S., Horikawa,M., Yamada,K., Ichino,N., Nakano,N.,
            Nakagawa,T., Miyagawa,J., Matsushita,N., Nagatsu,T., Taniguchi,N.
            and Ishiguro,H.
  TITLE     A novel brain-derived member of the epidermal growth factor family
            that interacts with ErbB3 and ErbB4
  JOURNAL   J. Biochem. 122 (3), 675-680 (1997)
  MEDLINE   98006324
   PUBMED   9348101
REFERENCE   2  (bases 1 to 3077)
  AUTHORS   Ishiguro,H.
  TITLE     Direct Submission
  JOURNAL   Submitted (21-DEC-1996) Hiroshi Ishiguro, Fujita Health University,
            ICMS; 1-98, kutsukake-cho, Toyoake, Aichi 470-11, Japan
            (E-mail:hishi@fujita-hu.ac.jp, Tel:0562-93-9393, Fax:0562-93-8831)
FEATURES             Location/Qualifiers
     source          1..3077
                     /organism="Rattus sp."
                     /mol_type="mRNA"
                     /db_xref="taxon:10118"
                     /cell_line="PC12"
                     /cell_type="pheochromocytoma"
     CDS             233..2815
                     /codon_start=1
                     /product="NTAK alpha2"
                     /protein_id="BAA23345.1"
                     /db_xref="GI:2605632"
                     /translation="MRQVCCSALPPPLEKARCSSYSYSDSSSSSSSNNSSSSTSSRSS

SRSSSRSSRGSTTTTSSSENSGSNSGSIFRPAAPPEPRPQPQPQPRSPAARRAAARSR 

AAAAGGMRRDPAPGSSMLLFGVSLACYSPSLKSVQDQAYKAPVWEGKVQGLAPAGGS 

SSNSTREPPASGRVALVKVLDKWPLRSGGLQREQVISVGSCAPLERNQRYIFFLEPTE 

QPLVFKTAFAPVDPNGKNIKKEVGKILCTDCATRPKLKKMKSQTGEVGEKQSLKCEAA 

AGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKVKVEDAGEYVCEAENILG 

KDTVRGRLHVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYYIEGINQLSCKCPNGF 

FGQRCLEKLPLRLYMPDPKQKAEELYQKRVLTITGICVTVLLWGIVCWAYCKTKKQR 

RQMHHHLRQNMCPAHQNRSLANGPSHPRLDPEEIQMADYISKNVPATDHVIRREAETT

FSGSHSCSPSHHCSTATPTSSHRHESHTWSLERSESLTSDSQSGIMLSSVGTSKCNSP

ACVEARARRAAAYSQEERRRAAMPPYHDSIDSLRDSPHSERYVSALTTPARLSPVDFH

YSLATQVPTFEITSPNSAHAVSLPPAAPISYRLAEQQPLLRHPAPPGPGPGPGADMQR

SYDSYYYPAAGPGPRRGACALGGSLGSLPASPFRIPEDDEYETTQECAPPPPPRPRTR

GASRRTSAGPRRWRRSRLNGLAAQRARAARDSLSLSSGSGCGSASASDDDADDADGAL

AAESTPFLGLRAAHDALRSDSPPLCPAADSRTYYSLDSHSTRASSRHSRGPPTRAKQD 
SGPL" 

BASE COUNT 673 a 996 c 945 g 463 t 

ORIGIN 



Query Match 82.2%; Score 737; DB 10; Length 3077; 

Best Local Similarity 90.3%; Pred. No. 1.6e-130; 

Matches 788; Conservative 0; Mismatches 85; Indels 0; Gaps 0;

Qy       1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
           ||||||||||||||||||||||||| ||| |||||||||||||||||||| |||||||||
Db     557 ATGAGGCGCGACCCGGCCCCCGGCTCCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 616

Qy      61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
           |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db     617 TACTCGCCCAGCCTCAAGTCCGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 676

Qy     121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
           |||||||||||||| |||| ||| || ||||| |||||||| |||||||||||||||||
Db     677 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCT 736

Qy     181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
           ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db     737 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 796

Qy     241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
           |||||||||||||||||||||||||||||||||||||| | |||||||||||||||||||
Db     797 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGCGCGCCGCTCGAAAGGAACCAG 856

Qy     301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
           |||||||||||||||||||||||||| || ||||||||||| |||||||| |||||||||
Db     857 CGCTACATCTTTTTCCTGGAGCCCACCGAGCAGCCCTTAGTTTTTAAGACAGCCTTTGCC 916

Qy     361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
           ||  ||||  | |||||||||||  |||||||||||||||||||||||||||||||||||
Db     917 CCGGTCGACCCTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 976

Qy     421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
           ||||| |||||||||||| |||||||||||||||| ||||| ||| ||||||| ||||||
Db     977 TGCGCAACCCGGCCCAAGCTGAAGAAGATGAAGAGTCAGACAGGAGAGGTGGGCGAGAAG 1036

Qy     481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
           || ||||| ||||||||||| || || || || |||||||| ||||| || |||||||||
Db    1037 CAGTCGCTCAAGTGTGAGGCGGCGGCGGGGAACCCCCAGCCCTCCTATCGATGGTTCAAG 1096

Qy     541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
           || ||||||||||||||||| || || |||||||||||||| ||||||||||||||||||
Db    1097 GACGGCAAGGAGCTCAACCGGAGTCGTGACATTCGCATCAAGTATGGCAACGGCAGAAAG 1156

Qy     601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
           |||||||| |||||||||||||| |||||||||||||||||||| ||||| ||||| |||
Db    1157 AACTCACGGCTACAGTTCAACAAAGTGAAGGTGGAGGACGCTGGAGAGTACGTCTGTGAG 1216

Qy     661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
           || ||||||||||| ||||||||||| ||  ||||||||||  | |||||||| ||||||
Db    1217 GCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGTGAGC 1276

Qy     721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
           ||||| ||||| ||||||||||||||||||||||||||||| |||||||||||||||||
Db    1277 ACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTCCTAC 1336

Qy     781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGT 840
           || || ||||||||||| |||||||||||||| ||||||||||| ||||||||||| |||
Db    1337 TGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAATGT 1396

Qy     841 CCTGTGGGATACACCGGGGACAGGTGTCAGCAG 873
           ||    |||| |  |||  | || |||  | ||
Db    1397 CCAAACGGATTCTTCGGACAGAGATGTTTGGAG 1429





RESULT 9 
AR098146
LOCUS       AR098146                1476 bp    DNA     linear   PAT 14-FEB-2001
DEFINITION  Sequence 7 from patent US 6074841.
ACCESSION   AR098146
VERSION     AR098146.1  GI:12807403
KEYWORDS
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1  (bases 1 to 1476)
  AUTHORS   Gearing,D.P. and Busfield,S.J.
  TITLE     Don-1 gene and polypeptides and uses therefor
  JOURNAL   Patent: US 6074841-A 7 13-JUN-2000;
FEATURES             Location/Qualifiers
     source          1..1476
                     /organism="unknown"
BASE COUNT      335 a    473 c    452 g    216 t
ORIGIN



Query Match 47.7%; Score 427.8; DB 6; Length 1476;
Best Local Similarity 89.8%; Pred. No. 1.4e-71;
Matches 459; Conservative 0; Mismatches 52; Indels 0; Gaps 0;

Qy     363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
           || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||
Db      98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157

Qy     423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
            |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     158 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 217

Qy     483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277

Qy     543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337

Qy     603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     338 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 397

Qy     663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457

Qy     723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517

Qy     783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGTCC 842
           |||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||
Db     518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577

Qy     843 TGTGGGATACACCGGGGACAGGTGTCAGCAG 873
               |||| |  |||  | || |||  | ||
Db     578 AAATGGATTCTTCGGACAGAGATGTTTGGAG 608



RESULT 10 

AR116618
LOCUS       AR116618                1476 bp    DNA     linear   PAT 16-MAY-2001
DEFINITION  Sequence 7 from patent US 6133423.
ACCESSION   AR116618
VERSION     AR116618.1  GI:14096940
KEYWORDS
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1  (bases 1 to 1476)
  AUTHORS   Gearing,D.P. and Busfield,S.J.
  TITLE     Don-1 gene and polypeptides and uses therefor
  JOURNAL   Patent: US 6133423-A 7 17-OCT-2000;
FEATURES             Location/Qualifiers
     source          1..1476
                     /organism="unknown"
BASE COUNT      335 a    473 c    452 g    216 t
ORIGIN

Query Match 47.7%; Score 427.8; DB 6; Length 1476;
Best Local Similarity 89.8%; Pred. No. 1.4e-71;
Matches 459; Conservative 0; Mismatches 52; Indels 0; Gaps 0;

Qy     363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
           || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||
Db      98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157

Qy     423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
            |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     158 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 217

Qy     483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277

Qy     543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337

Qy     603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     338 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 397

Qy     663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457

Qy     723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517

Qy     783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGTCC 842
           |||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||
Db     518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577

Qy     843 TGTGGGATACACCGGGGACAGGTGTCAGCAG 873
               |||| |  |||  | || |||  | ||
Db     578 AAATGGATTCTTCGGACAGAGATGTTTGGAG 608



RESULT 11 

AR098155
LOCUS       AR098155                2268 bp    DNA     linear   PAT 14-FEB-2001
DEFINITION  Sequence 31 from patent US 6074841.
ACCESSION   AR098155
VERSION     AR098155.1  GI:12807412
KEYWORDS
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1  (bases 1 to 2268)
  AUTHORS   Gearing,D.P. and Busfield,S.J.
  TITLE     Don-1 gene and polypeptides and uses therefor
  JOURNAL   Patent: US 6074841-A 31 13-JUN-2000;
FEATURES             Location/Qualifiers
     source          1..2268
                     /organism="unknown"
BASE COUNT      502 a    734 c    701 g    331 t
ORIGIN



Query Match 47.7%; Score 427.8; DB 6; Length 2268;
Best Local Similarity 89.8%; Pred. No. 1.4e-71;
Matches 459; Conservative 0; Mismatches 52; Indels 0; Gaps 0;

Qy     363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
           || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||
Db      98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157

Qy     423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
            |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     158 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 217

Qy     483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277

Qy     543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337

Qy     603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     338 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 397

Qy     663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457

Qy     723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517

Qy     783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGTCC 842
           |||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||
Db     518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577

Qy     843 TGTGGGATACACCGGGGACAGGTGTCAGCAG 873
               |||| |  |||  | || |||  | ||
Db     578 AAATGGATTCTTCGGACAGAGATGTTTGGAG 608



RESULT 12 

AR116627
LOCUS       AR116627                2268 bp    DNA     linear   PAT 16-MAY-2001
DEFINITION  Sequence 31 from patent US 6133423.
ACCESSION   AR116627
VERSION     AR116627.1  GI:14096949
KEYWORDS
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1  (bases 1 to 2268)
  AUTHORS   Gearing,D.P. and Busfield,S.J.
  TITLE     Don-1 gene and polypeptides and uses therefor
  JOURNAL   Patent: US 6133423-A 31 17-OCT-2000;
FEATURES             Location/Qualifiers
     source          1..2268
                     /organism="unknown"
BASE COUNT      502 a    734 c    701 g    331 t
ORIGIN



Query Match 47.7%; Score 427.8; DB 6; Length 2268;
Best Local Similarity 89.8%; Pred. No. 1.4e-71;
Matches 459; Conservative 0; Mismatches 52; Indels 0; Gaps 0;

Qy     363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
           || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||
Db      98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157

Qy     423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
            |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     158 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 217

Qy     483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277

Qy     543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337

Qy     603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     338 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 397

Qy     663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457

Qy     723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517

Qy     783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGTCC 842
           |||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||
Db     518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577

Qy     843 TGTGGGATACACCGGGGACAGGTGTCAGCAG 873
               |||| |  |||  | || |||  | ||
Db     578 AAATGGATTCTTCGGACAGAGATGTTTGGAG 608





RESULT 13 



AB001576
LOCUS       AB001576                2188 bp    mRNA    linear   ROD 13-FEB-1999
DEFINITION  Rattus sp. mRNA for NTAK alpha2-lp, partial cds.
ACCESSION   AB001576
VERSION     AB001576.1  GI:2605478
KEYWORDS    neural- and thymus-derived activator for ErbB kinases.
SOURCE      Rattus sp.
  ORGANISM  Rattus sp.
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae;
            Rattus.
REFERENCE   1  (sites)
  AUTHORS   Higashiyama,S., Horikawa,M., Yamada,K., Ichino,N., Nakano,N.,
            Nakagawa,T., Miyagawa,J., Matsushita,N., Nagatsu,T., Taniguchi,N.
            and Ishiguro,H.
  TITLE     A novel brain-derived member of the epidermal growth factor family
            that interacts with ErbB3 and ErbB4
  JOURNAL   J. Biochem. 122 (3), 675-680 (1997)
  MEDLINE   98006324
   PUBMED   9348101
REFERENCE   2  (bases 1 to 2188)
  AUTHORS   Ishiguro,H.
  TITLE     Direct Submission
  JOURNAL   Hiroshi Ishiguro, Fujita Health University,
            ICMS; 1-98, kutsukake-cho, Toyoake, Aichi 470-11, Japan
            (E-mail:hishi@fujita-hu.ac.jp, Tel:0562-93-9393, Fax:0562-93-8831)
FEATURES             Location/Qualifiers
     source          1..2188
                     /organism="Rattus sp."
                     /mol_type="mRNA"
                     /db_xref="taxon:10118"
                     /cell_line="PC12"
                     /cell_type="pheochromocytoma"
     CDS             <1..1926
                     /note="neural- and thymus-derived activator for ErbB
                     kinases (NTAK); a member of the epidermal growth factor
                     (EGF) family"
                     /codon_start=1
                     /product="NTAK alpha2-lp"
                     /protein_id="BAA23348.1"
                     /db_xref="GI:2605479"
                     /translation="FFFFKTAFAPVDPNGKNIKKEVGKILCTDCATRPKLKKMKSQTG
                     EVGEKQSLKCEAAAGNPQPSYRWFKDGKELNRSRDIRIKYGNGRKNSRLQFNKVKVED
                     AGEYVCEAENILGKDTVRGRLHVNSVSTTLSSWSGHARKCNETAKSYCVNGGVCYYIE
                     GINQLSCKCPNGFFGQRCLEKLPLRLYMPDPKQKAEELYQKRVLTITGICVALLWGI
                     VCWAYCKTKKQRRQMHHHLRQNMCPAHQNRSLANGPSHPRLDPEEIQMADYISKNVP
                     ATDHVIRREAETTFSGSHSCSPSHHCSTATPTSSHRHESHTWSLERSESLTSDSQSGI
                     MLSSVGTSKCNSPACVEARARRAAAYSQEERRRAAMPPYHDSIDSLRDSPHSERYVSA
                     LTTPARLSPVDFHYSLATQVPTFEITSPNSAHAVSLPPAAPISYRLAEQQPLLRHPAP
                     PGPGPGPGADMQRSYDSYYYPAAGPGPRRGACALGGSLGSLPASPFRIPEDDEYETTQ
                     ECAPPPPPRPRTRGASRRTSAGPRRWRRSRLNGLAAQRARAARDSLSLSSGSGCGSAS
                     ASDDDADDADGALAAESTPFLGLRAAHDALRSDSPPLCPAADSRTYYSLDSHSTRASS
                     RHSRGPPTRAKQDSGPL"
BASE COUNT      515 a    674 c    643 g    356 t
ORIGIN



Query Match 47.4%; Score 425; DB 10; Length 2188;
Best Local Similarity 87.0%; Pred. No. 4.7e-71;
Matches 467; Conservative 0; Mismatches 70; Indels 0; Gaps 0;

Qy     337 TTAGTCTTTAAGACGGCCTTTGCCCCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAG 396
           ||  | |||||||| |||||||||||  ||||  | |||||||||||  |||||||||||
Db       4 TTTTTTTTTAAGACAGCCTTTGCCCCGGTCGACCCTAACGGCAAAAACATCAAGAAAGAG 63

Qy     397 GTGGGCAAGATCCTGTGCACTGACTGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGC 456
           ||||||||||||||||||||||||||||| |||||||||||| ||||||||||||||||
Db      64 GTGGGCAAGATCCTGTGCACTGACTGCGCAACCCGGCCCAAGCTGAAGAAGATGAAGAGT 123

Qy     457 CAGACGGGACAGGTGGGTGAGAAGCAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCC 516
           ||||| ||| ||||||| |||||||| ||||| ||||||||||| || || || || |||
Db     124 CAGACAGGAGAGGTGGGCGAGAAGCAGTCGCTCAAGTGTGAGGCGGCGGCGGGGAACCCC 183

Qy     517 CAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGC 576
           ||||| ||||| || ||||||||||| ||||||||||||||||| || || |||||||||
Db     184 CAGCCCTCCTATCGATGGTTCAAGGACGGCAAGGAGCTCAACCGGAGTCGTGACATTCGC 243

Qy     577 ATCAAATATGGCAACGGCAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAG 636
           ||||| |||||||||||||||||||||||||| |||||||||||||| ||||||||||||
Db     244 ATCAAGTATGGCAACGGCAGAAAGAACTCACGGCTACAGTTCAACAAAGTGAAGGTGGAG 303

Qy     637 GACGCTGGGGAGTATGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGC 696
           |||||||| ||||| ||||| ||||| ||||||||||| ||||||||||| ||  |||||
Db     304 GACGCTGGAGAGTACGTCTGTGAGGCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGC 363

Qy     697 CGGCTTTACGTCAACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAG 756
           |||||  | |||||||| ||||||||||| ||||| ||||||||||||||||||||||||
Db     364 CGGCTCCATGTCAACAGTGTGAGCACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAG 423

Qy     757 TGCAACGAGACAGCCAAGTCCTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGC 816
           ||||| ||||||||||||||||| || || ||||||||||| |||||||||||||| |||
Db     424 TGCAATGAGACAGCCAAGTCCTACTGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGC 483

Qy     817 ATCAACCAGCTCTCCTGCAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAG 873
           |||||||| ||||||||||| |||||    |||| |  |||  | || |||  | ||
Db     484 ATCAACCAACTCTCCTGCAAATGTCCAAACGGATTCTTCGGACAGAGATGTTTGGAG 540



RESULT 14 
AC094080 

LOCUS       AC094080              118504 bp    DNA     27-MAR-2002
DEFINITION  Homo sapiens chromosome 5 clone CTB-77K22, complete sequence.
ACCESSION   AC094080
VERSION     AC094080.4  GI:19747152
KEYWORDS    HTG.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 118504)

~ oiLcfsuw::r:n'^"^'"'^ ^^^-^^^^ — — - 

JOURNAL Unpublished 

REFERENCE   2  (bases 1 to 118504)
  AUTHORS   DOE Joint Genome Institute.
  TITLE     Direct Submission
  JOURNAL   Submitted (14-SEP-2001) Production Sequencing Facility, DOE Joint
            Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598 USA
REFERENCE   3  (bases 1 to 118504)
  AUTHORS   DOE Joint Genome Institute.
  TITLE     Direct Submission
  JOURNAL   Submitted (07-MAR-2002) Production Sequencing Facility, DOE Joint
            Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598 USA
REFERENCE   4  (bases 1 to 118504)
  AUTHORS   DOE Joint Genome Institute and Stanford Human Genome Center.
  TITLE     Direct Submission
  JOURNAL   Submitted (27-MAR-2002) DOE Joint Genome Institute, 2800 Mitchell
            Drive, Walnut Creek, CA 94598, USA
COMMENT     On Mar 27, 2002 this sequence version replaced gi:19224838.
            Draft Sequence Produced by DOE Joint Genome Institute
            www.jgi.doe.gov
            Finishing Completed at Stanford Human Genome Center
            www-shgc.stanford.edu
            Quality: Phrap Quality >=40 99.7% of Sequence;
            Estimated Total Number of Errors is 0.5.
FEATURES             Location/Qualifiers
     source          1..118504
                     /organism="Homo sapiens"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9606"
                     /chromosome="5"
                     /clone="CTB-77K22"
BASE COUNT    33986 a  23829 c  24682 g  36007 t
ORIGIN



Query Match 47.4%; Score 424.8; DB 9; Length 118504;
Best Local Similarity 94.2%; Pred. No. 3.9e-71;
Matches 452; Conservative 0; Mismatches 27; Indels 1; Gaps 1;

Qy       1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   80890 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 80949

Qy      61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   80950 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 81009

Qy     121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   81010 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 81069

Qy     181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   81070 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 81129

Qy     241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   81130 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 81189

Qy     301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   81190 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 81249

Qy     361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   81250 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 81309

Qy     421 TGC-GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAA 479
           ||| |  |  || |||        |  |  |||  |   || |||  ||||||  |||||
Db   81310 TGCGGTGAGTCGCCCCCTCCCTTTGCTGGAGAAAGGGGGGAGGGGCGAGGTGGTGGAGAA 81369



RESULT 15 

AC011589/C
LOCUS       AC011589              152838 bp    DNA     linear   HTG 30-MAR-2000
DEFINITION  Homo sapiens clone RP11-13O18, WORKING DRAFT SEQUENCE, 10 unordered
            pieces.
ACCESSION   AC011589
VERSION     AC011589.3  GI:7341917
KEYWORDS    HTG; HTGS_PHASE1; HTGS_DRAFT.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 152838)
  AUTHORS   Birren,B., Linton,L., Nusbaum,C. and Lander,E.
  TITLE     Homo sapiens, clone RP11-13O18
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 152838)
  AUTHORS   Birren,B., Linton,L., Nusbaum,C., Lander,E., Allen,N., Anderson,M.,
            Baldwin,J., Barna,N., Beckerly,R., Boguslavkiy,L., Boukhgalter,B.,
            Brown,A., Castle,A., Colangelo,M., Collins,S., Collymore,A.,
            Cooke,P., DeArellano,K., Dewar,K., Domino,M., Donelan,L., Doyle,M.,
            Ferreira,P., FitzHugh,W., Forrest,C., Funke,R., Gage,D.,
            Galagan,J., Gardyna,S., Grant,G., Hagos,B., Heaford,A., Horton,L.,
            Howland,J.C., Johnson,R., Jones,C., Kann,L., Karatas,A., Klein,J.,
            Lehoczky,J., Lieu,C., Locke,K., Macdonald,P., Marquis,N.,
            McEwan,P., McGurk,A., McKernan,K., McLaughlin,J., Meldrim,J.,
            Morrow,J., Naylor,J., Norman,C.H., O'Connor,T., O'Donnell,P.,
            Peterson,K., Pollara,V., Riley,R., Roy,A., Santos,R., Severy,P.,
            Stange-Thomann,N., Stojanovic,N., Subramanian,A., Talamas,J.,
            Tesfaye,S., Tirrell,A., Vassiliev,H., Vo,A., Wheeler,J., Wu,X.,
            Wyman,D., Ye,W.J., Zimmer,A. and Zody,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (07-OCT-1999) Whitehead Institute/MIT Center for Genome
            Research, 320 Charles Street, Cambridge, MA 02141, USA
COMMENT     On Mar 30, 2000 this sequence version replaced gi:6139107.
            All repeats were identified using RepeatMasker:
            Smit, A.F.A. & Green, P. (1996-1997)
            http://ftp.genome.washington.edu/RM/RepeatMasker.html

            Genome Center
            Center: Whitehead Institute/ MIT Center for Genome Research
            Center code: WIBR
            Web site: http://www-seq.wi.mit.edu
            Contact: sequence_submissions@genome.wi.mit.edu

            Project Information
            Center project name: L3375
            Center clone name: 13 O 18

            Summary Statistics
            Sequencing vector: M13; M77815; 100% of reads
            Chemistry: Dye-terminator Big Dye; 100% of reads
            Assembly program: Phrap; version 0.960731
            Consensus quality: 147077 bases at least Q40
            Consensus quality: 149570 bases at least Q30
            Consensus quality: 150764 bases at least Q20
            Insert size: 188000; agarose-fp
            Insert size: 151938; sum-of-contigs
            Quality coverage: 4.4 in Q20 bases; agarose-fp
            Quality coverage: 5.4 in Q20 bases; sum-of-contigs

            NOTE: This is a 'working draft' sequence. It currently
            consists of 10 contigs. The true order of the pieces
            is not known and their order in this sequence record is
            arbitrary. Gaps between the contigs are represented as
            runs of N, but the exact sizes of the gaps are unknown.
            This record will be updated with the finished sequence
            as soon as it is available and the accession number will
            be preserved.
                   1     1840: contig of 1840 bp in length
                1841     1940: gap of 100 bp
                1941     5083: contig of 3143 bp in length
                5084     5183: gap of 100 bp
                5184     6473: contig of 1290 bp in length
                6474     6573: gap of 100 bp
                6574    11402: contig of 4829 bp in length
               11403    11502: gap of 100 bp
               11503    25453: contig of 13951 bp in length
               25454    25553: gap of 100 bp
               25554    40778: contig of 15225 bp in length
               40779    40878: gap of 100 bp
               40879    58024: contig of 17146 bp in length
               58025    58124: gap of 100 bp
               58125    87982: contig of 29858 bp in length
               87983    88082: gap of 100 bp
               88083   121177: contig of 33095 bp in length
              121178   121277: gap of 100 bp
              121278   152838: contig of 31561 bp in length.
FEATURES             Location/Qualifiers
     source          1..152838
                     /organism="Homo sapiens"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9606"
                     /clone="RP11-13O18"
                     /clone_lib="RPCI-11 Human Male BAC"
     misc_feature    1..1840
                     /note="assembly_fragment"
     misc_feature    1941..5083
                     /note="assembly_fragment"
     misc_feature    5184..6473
                     /note="assembly_fragment
                     clone_end:SP6
                     vector_side:right"
     misc_feature    6574..11402
                     /note="assembly_fragment"
     misc_feature    11503..25453
                     /note="assembly_fragment"
     misc_feature    25554..40778
                     /note="assembly_fragment"
     misc_feature    40879..58024
                     /note="assembly_fragment"
     misc_feature    58125..87982
                     /note="assembly_fragment"
     misc_feature    88083..121177
                     /note="assembly_fragment"
     misc_feature    121278..152838
                     /note="assembly_fragment
                     clone_end:T7
                     vector_side:right"
BASE COUNT    43896 a  33588 c  32260 g  42194 t    900 others
ORIGIN

Query Match 47.4%; Score 424.8; DB2; Length 152838; 

Best Local Similarity 94.2%; Pred. No. 3.8e-71; 

Matches 452; Conservative 0; Mismatches 27; Indels 1; Gaps 1; 



Qy      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  54376 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 54317

Qy     61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  54316 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 54257

Qy    121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  54256 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 54197

Qy    181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  54196 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 54137

Qy    241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  54136 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 54077

Qy    301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  54076 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 54017

Qy    361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  54016 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 53957

Qy    421 TGC-GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAA 479
          ||| |  |  || |||        |  |  |||  |   || |||  ||||||  |||||
Db  53956 TGCGGTGAGTCGCCCCCTCCCTTTGCTGGAGAAAGGGGGGAGGGGCGAGGTGGTGGAGAA 53897
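The Db coordinates in the alignment above run downward (54376 to 53897), which suggests the hit lies on the complementary strand of the database entry. Assuming that reading, the sketch below shows how the aligned region could be mapped back to forward-strand coordinates and how a reverse complement is taken; both helper names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def forward_interval(db_start: int, db_end: int) -> tuple:
    """Order a possibly descending coordinate pair as (low, high)."""
    return (min(db_start, db_end), max(db_start, db_end))


print(forward_interval(54376, 53897))
print(reverse_complement("ATGAGG"))
```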



Search completed: January 14, 2004, 10:22:59 
Job time : 3591.5 secs



GenCore version 5.1.6
Copyright (c) 1993 - 2004 Compugen Ltd.

OM nucleic - nucleic search, using sw model

Run on:         January 14, 2004, 07:12:21 ; Search time 290.304 Seconds
                (without alignments)
                8340.911 Million cell updates/sec

Title:          US-09-864-675-3
Perfect score:  897
Sequence:       1 atgaggcgcgacccggcccc...caatggtcaacttctcctaa 897

Scoring table:  IDENTITY_NUC
                Gapop 10.0 , Gapext 1.0

Searched:       2552756 seqs, 1349719017 residues

Total number of hits satisfying chosen parameters:      5105512

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries
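The "Query Match" percentages printed for each result in this report agree with the result's score divided by the perfect score (897 here). The report does not state this definition, so the sketch below is an assumption checked only against the printed figures; `query_match_percent` is a hypothetical name.

```python
def query_match_percent(score: float, perfect_score: float) -> float:
    """Score as a percentage of the perfect score, to one decimal place."""
    if perfect_score <= 0:
        raise ValueError("perfect_score must be positive")
    return round(100.0 * score / perfect_score, 1)


# Scores taken from the alignments in this report.
for score in (897, 849, 835.4, 784):
    print(score, query_match_percent(score, 897))
```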



Database :      N_Geneseq_19Jun03:*
                1:  /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1980.DAT:*
                2:  /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1981.DAT:*
                3:  /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1982.DAT:*
                4:  /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1983.DAT:*
                5:  /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1984.DAT:*
                6:  /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1985.DAT:*
                7:  /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1986.DAT:*
                8:  /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1987.DAT:*
                9:  /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1988.DAT:*
                10: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1989.DAT:*
                11: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1990.DAT:*
                12: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1991.DAT:*
                13: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1992.DAT:*
                14: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1993.DAT:*
                15: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1994.DAT:*
                16: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1995.DAT:*
                17: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1996.DAT:*
                18: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1997.DAT:*
                19: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1998.DAT:*
                20: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA1999.DAT:*
                21: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA2000.DAT:*
                22: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA2001A.DAT:*
                23: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA2001B.DAT:*
                24: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA2002.DAT:*
                25: /SIDS1/gcgdata/geneseq/geneseqn-embl/NA2003.DAT:*



Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.

SUMMARIES 
41
Bovine glial cell
AAT48088
Human neuregulin G
Nucleotide sequenc



ALIGNMENTS 



RESULT 1 
AAS18020 

ID AAS18020 standard; cDNA; 897 BP. 
XX 

AC AAS18020; 
XX 

DT 12-MAR-2002 (first entry) 
XX 

DE Human cDNA encoding Neuregulin-2beta, NRG-2beta. 
XX 

KW Human; ss; neuregulin-2; NRG-2alpha; NRG-2beta; mitogenesis;

KW cell survival; cell growth; cell differentiation; erbB receptor;

KW cardiomyopathy; ischaemic damage; cardiac trauma; heart failure;

KW atherosclerosis; vascular lesion; vascular hypertension; 

KW degenerative congenital vascular disease; myasthenia gravis; 

KW neurodegenerative disorder; peripheral neuropathy; 

KW sensory nerve fiber neuropathy; motor fiber neuropathy; 

KW sensory nerve fiber neuropathy; multiple sclerosis; 

KW amyotrophic lateral sclerosis; spinal muscular atrophy; nerve injury; 

KW Alzheimer's disease; Parkinson's disease; cerebellar ataxia; 

KW spinal cord injury; tumour; neurofibromatosis; transgenic animal. 

XX 

OS Homo sapiens.
XX 

FH Key Location/Qualifiers 

FT CDS 1..897

FT /*tag= a

FT /product= "NRG-2beta"

XX 

PN WO200189568-A1. 
XX 

PD 29-NOV-2001. 
XX 

PF 23-MAY-2001; 2001WO-US16896.
XX 

PR 23-MAY-2000; 2000US-206495P 
XX 

PA (CENE-) CENES PHARM INC. 
XX 

PI Marchionni MA; 
XX 

DR WPI; 2002-097612/13. 

DR P-PSDB; AAU11636. 
XX 

PT Neuregulin-2 polypeptide and polynucleotide useful for treating 

PT multiple sclerosis, spinal muscular atrophy, nerve injury, Alzheimer's 

PT disease, by increasing mitogenesis, survival, growth or differentiation 

PT of a cell - 

XX 

PS Claim 57; Fig 8; 79pp; English. 
XX 

CC The invention relates to a substantially pure neuregulin (NRG)-2

CC polypeptide comprising or consisting of a sequence for human 

CC NRG-2alpha or NRG-2beta (clone 2b7) and the polynucleotides encoding 

CC the. Also included are a vector expressing the protein, a host cell 

CC comprising the vector, a transgenic non-human animal transformed with 



CC the vector or having a knockout mutation in one or both NRG-2 

CC alleles and an anti-NRG-2 antibody. Analysis of mutations in NRG-2 in an 

CC individual is useful for diagnosing an increased likelihood of 

CC developing a NRG-2-related disease or condition in a test subject. 

CC NRG-2 is useful for increasing the mitogenesis, survival, growth or 

CC differentiation of a cell (e.g. a neuronal cell), where the cell 

CC expresses an erbB receptor. NRG-2 is useful for treating diseases 

CC and disorders such as cardiomyopathy (preferably degenerative congenital

CC disease), ischaemic damage, cardiac trauma or heart failure or which

CC has a condition affecting smooth muscle which include atherosclerosis, 

CC vascular lesion, vascular hypertension, and degenerative congenital 

CC vascular disease, myasthenia gravis, a neurodegenerative disorder, 

CC peripheral neuropathy, a sensory nerve fiber neuropathy, a motor fiber 

CC and a sensory nerve fiber neuropathy, multiple sclerosis, amyotrophic 

CC lateral sclerosis, spinal muscular atrophy, nerve injury, Alzheimer's 

CC disease, Parkinson's disease, cerebellar ataxia, and spinal cord injury. 

CC The antibody is useful for treatment of a tumour comprising inhibiting 

CC proliferation of a tumour cell preferably a glial tumour cell, for 

CC treating of neurofibromatosis by inhibiting glial cell mitogenesis. 

CC The present sequence encodes NRG-2beta. 

XX 

SQ Sequence 897 BP; 200 A; 261 C; 282 G; 154 T; 0 other;
Query Match 100.0%; Score 897; DB 24; Length 897; 
Best Local Similarity 100.0%; Pred. No. 2.4e-199; 

Matches 897; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60

Qy     61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120

Qy    121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180

Qy    181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240

Qy    241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300

Qy    301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360

Qy    361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420

Qy    421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480

Qy    481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540

Qy    541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600

Qy    601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660

Qy    661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720

Qy    721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780

Qy    781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGT 840
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGT 840

Qy    841 CCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTCCTAA 897
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    841 CCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTCCTAA 897



RESULT 2 
AAS18019

ID AAS18019 standard; cDNA; 994 BP.
XX 

AC AAS18019; 
XX 

DT 12-MAR-2002 (first entry) 
XX 

DE Human cDNA encoding Neuregulin-2alpha, NRG-2alpha. 
XX 

KW Human; ss; neuregulin-2; NRG-2alpha; NRG-2beta; mitogenesis; 

KW cell survival; cell growth; cell differentiation; erbB receptor; 

KW cardiomyopathy; ischaemic damage; cardiac trauma; heart failure; 

KW atherosclerosis; vascular lesion; vascular hypertension; 

KW degenerative congenital vascular disease; myasthenia gravis; 

KW neurodegenerative disorder; peripheral neuropathy; 

KW sensory nerve fiber neuropathy; motor fiber neuropathy; 

KW sensory nerve fiber neuropathy; multiple sclerosis; 

KW amyotrophic lateral sclerosis; spinal muscular atrophy; nerve injury; 

KW Alzheimer's disease; Parkinson's disease; cerebellar ataxia; 

KW spinal cord injury; tumour; neurofibromatosis; transgenic animal.

XX 

OS Homo sapiens.
XX 

FH Key Location/Qualifiers 

FT CDS 1..993 



FT /*tag= a 

FT /product= "NRG-2alpha"

XX 

PN WO200189568-A1. 
XX 

PD 29-NOV-2001.
XX 

PF 23-MAY-2001; 2001WO-US16896. 
XX 

PR 23-MAY-2000; 2000US-206495P.
XX 

PA (CENE-) CENES PHARM INC. 
XX 

PI Marchionni MA; 
XX 

DR WPI; 2002-097612/13. 

DR P-PSDB; AAU11635. 
XX 

PT Neuregulin-2 polypeptide and polynucleotide useful for treating 

PT multiple sclerosis, spinal muscular atrophy, nerve injury, Alzheimer's 

PT disease, by increasing mitogenesis, survival, growth or differentiation 

PT of a cell - 

XX 

PS Claim 57; Fig 6; 79pp; English. 
XX 

CC The invention relates to a substantially pure neuregulin (NRG)-2

CC polypeptide comprising or consisting of a sequence for human 

CC NRG-2alpha or NRG-2beta (clone 2b7) and the polynucleotides encoding 

CC the. Also included are a vector expressing the protein, a host cell 

CC comprising the vector, a transgenic non-human animal transformed with 

CC the vector or having a knockout mutation in one or both NRG-2 

CC alleles and an anti-NRG-2 antibody. Analysis of mutations in NRG-2 in an 

CC individual is useful for diagnosing an increased likelihood of 

CC developing a NRG-2-related disease or condition in a test subject. 

CC NRG-2 is useful for increasing the mitogenesis, survival, growth or 

CC differentiation of a cell (e.g. a neuronal cell), where the cell 

CC expresses an erbB receptor. NRG-2 is useful for treating diseases 

CC and disorders such as cardiomyopathy (preferably degenerative congenital 

CC disease), ischaemic damage, cardiac trauma or heart failure or which

CC has a condition affecting smooth muscle which include atherosclerosis, 

CC vascular lesion, vascular hypertension, and degenerative congenital 

CC vascular disease, myasthenia gravis, a neurodegenerative disorder, 

CC peripheral neuropathy, a sensory nerve fiber neuropathy, a motor fiber 

CC and a sensory nerve fiber neuropathy, multiple sclerosis, amyotrophic 

CC lateral sclerosis, spinal muscular atrophy, nerve injury, Alzheimer's 

CC disease, Parkinson's disease, cerebellar ataxia, and spinal cord injury. 

CC The antibody is useful for treatment of a tumour comprising inhibiting 

CC proliferation of a tumour cell preferably a glial tumour cell, for 

CC treating of neurofibromatosis by inhibiting glial cell mitogenesis. 

CC The present sequence encodes NRG-2alpha. 

XX 

SQ Sequence 994 BP; 230 A; 279 C; 304 G; 181 T; 0 other; 



Query Match 94.6%; Score 849; DB 24; Length 994; 

Best Local Similarity 98.3%; Pred. No. 3.6e-188; 

Matches 858; Conservative 0; Mismatches 15; Indels 0; Gaps 0;
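The "Best Local Similarity" figures in this report agree with matches divided by the number of aligned columns (matches + mismatches + indels). That definition is inferred from the printed numbers rather than stated in the report; the sketch below, with a hypothetical function name, reproduces them.

```python
def best_local_similarity(matches: int, mismatches: int, indels: int) -> float:
    """Percent identity over the aligned columns, to one decimal place."""
    columns = matches + mismatches + indels
    if columns == 0:
        raise ValueError("alignment has no columns")
    return round(100.0 * matches / columns, 1)


# Counts printed for this result: 858 matches, 15 mismatches, 0 indels.
print(best_local_similarity(858, 15, 0))
```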



Qy      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60

Qy     61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120

Qy    121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180

Qy    181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240

Qy    241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300

Qy    301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360

Qy    361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420

Qy    421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480

Qy    481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540

Qy    541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600

Qy    601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660

Qy    661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720

Qy    721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780

Qy    781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGT 840
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||| |||
Db    781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840

Qy    841 CCTGTGGGATACACCGGGGACAGGTGTCAGCAG 873
          ||    |||| |  |||  | || |||  | ||
Db    841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAG 873



RESULT 3 
AAV17814 
ID AAV17814 standard; cDNA; 1884 BP.
XX
AC AAV17814;
XX
DT 17-AUG-1998 (first entry)
XX
DE Homo sapiens don-1 gene splice variant.
XX
KW Murine; don-1 gene; melanoma; treatment; adenocarcinoma;
KW epithelial cell; proliferation; stimulation; treatment; tumours;
KW skin; oesophagus; lung; breast; liver; pancreas; colon; prostate;
KW gastrointestinal tract; uterus; wound healing; transmembrane; ss.
XX
OS Homo sapiens.
XX
FH Key Location/Qualifiers
FT CDS 664..1884
FT /*tag= a
FT /note= "don-1 polypeptide"
XX
PN WO9807736-A1.
XX
PD 26-FEB-1998.
XX
PF 18-AUG-1997; 97WO-US14585.
XX
PR 19-NOV-1996; 96US-0753007.
PR 19-AUG-1996; 96US-0699591.
XX
PA (MILL-) MILLENNIUM BIOTHERAPEUTICS INC.
XX
PI Busfield SJ, Gearing DP;
XX
DR WPI; 1998-169084/15.
DR P-PSDB; AAW48381.
XX
PT Mouse and human don-1 polypeptide(s) - useful for treatment of
PT melanomas and adenocarcinoma(s), and for wound healing
XX
PS Claim 4; Fig 3; 121pp; English.
XX
CC The sequence is that of a human don-1 gene splice variant.
CC Don-1 polypeptides stimulate proliferation of epithelial cells
CC and thus are implicated in melanomas and adenocarcinomas in which
CC epithelial cells proliferate out of control. Compounds that
CC interfere with don-1 mediated cell proliferation can be used
CC in the treatment of tumours such as melanomas and adenocarcinomas
CC of the skin, oesophagus, lung, breast, liver, pancreas,
CC gastrointestinal tract, colon, prostate or uterus. Alternatively,
CC don-1 polypeptides can be used to stimulate epithelial cell
CC proliferation, e.g. for wound healing.



XX 

SQ Sequence 1884 BP; 426 A; 607 C; 560 G; 291 T; 0 other; 



Query Match 93.1%; Score 835.4; DB 19; Length 1884; 

Best Local Similarity 98.1%; Pred. No. 6.2e-185; 

Matches 856; Conservative 0; Mismatches 16; Indels 1; Gaps 1; 
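The printed scores can be reproduced from the match, mismatch and gap counts if one assumes +1 per match, -0.6 per mismatch, and a gap of length n costing Gapop + n x Gapext (10.0 and 1.0 in this run). The -0.6 mismatch value and the gap-cost form are assumptions: the report names the scoring table (IDENTITY_NUC) but does not list its values. A minimal sketch with a hypothetical function name:

```python
def sw_score(matches, mismatches, gap_lengths,
             match=1.0, mismatch=-0.6, gap_open=10.0, gap_ext=1.0):
    """Alignment score under the assumed scheme, to one decimal place.

    gap_lengths lists the length of each gap; a gap of length n is
    assumed to cost gap_open + n * gap_ext.
    """
    gap_cost = sum(gap_open + gap_ext * n for n in gap_lengths)
    return round(matches * match + mismatches * mismatch - gap_cost, 1)


# Result 3: 856 matches, 16 mismatches, one single-base gap.
print(sw_score(856, 16, [1]))
# Result 2: 858 matches, 15 mismatches, no gaps.
print(sw_score(858, 15, []))
```

Under these assumptions the two calls give the scores printed for Results 3 and 2 (835.4 and 849).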

Qy      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    218 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 277

Qy     61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    278 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 337

Qy    121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    338 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 397

Qy    181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    398 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 457

Qy    241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    458 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 517

Qy    301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    518 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 577

Qy    361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
          ||||| |||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db    578 CCCCT-GATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGGC 636

Qy    421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    637 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 696

Qy    481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    697 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 756

Qy    541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    757 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 816

Qy    601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    817 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 876

Qy    661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    877 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 936

Qy    721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    937 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 996

Qy    781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGT 840
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||| |||
Db    997 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 1056

Qy    841 CCTGTGGGATACACCGGGGACAGGTGTCAGCAG 873
          ||    |||| |  |||  | || |||  | ||
Db   1057 CCAAATGGATTCTTCGGACAGAGATGTTTGGAG 1089



RESULT 4 
AAT87922 

ID AAT87922 standard; cDNA; 3441 BP.
XX
AC AAT87922;
XX
DT 18-DEC-1997 (first entry)
XX
DE Rat cerebellum derived growth factor 1 cDNA.
XX
KW Rat; cerebellum derived growth factor; CDGF1; screening; binding;
KW modulation; erbB type receptor; identification; indication; risk;
KW proliferation; differentiation; induction; neuron; hyperplasia;
KW stem cell culture; intracerebral graft; alleviation; repair;
KW behavioural defect; nervous system; central; peripheral; nerve;
KW prothesis; damage; entubulation; cell survival; treatment;
KW injury; trauma; ischaemia; ischemia; stroke; infection; disorder;
KW inflammation; neurodegeneration; disease; Parkinson's;
KW Huntingdon's; amylotrophic lateral sclerosis; sensory; retina;
KW spinocerebellar degeneration; multiple sclerosis; neoplasia;
KW amalignant glioma; medulloblastoma; neuroectodermal tumour; ds.
XX
OS Rattus rattus.
XX
FH Key Location/Qualifiers
FT CDS 180..2444
FT /*tag= a
FT sig_peptide 180..248
FT /*tag= b
FT mat_peptide 249..2441
FT /*tag= c
FT /product= cerebellum_derived_growth_factor
XX
PN WO9709425-A1.
XX
PD 13-MAR-1997.
XX
PF 09-SEP-1996; 96WO-US14484.
XX
PR 08-SEP-1995; 95US-0525864.
XX
PA (HARD ) HARVARD COLLEGE.
PA (STRD ) UNIV LELAND STANFORD JUNIOR.
PA (STRD ) UNIV LELAND S STANFORD.
XX
PI Chang H;



XX 

DR WPI; 1997-192900/17. 

DR P-PSDB; AAW27536.
XX 

PT Rat and human cerebellum-derived growth factors - used in the

PT treatment of neuronal injury and proliferative disorders

XX 

PS Claim 17; Pages 63-66; 94pp; English. 
XX 

CC The present sequence encodes rat cerebellum derived growth factor 1

CC (CDGF1), which can be used to screen for modulators of CDGF

CC binding to erbB type receptors. Identification of a modification or 

CC mutation in a CDGF gene, or aberrant expression of a CDGF gene or 

CC levels of soluble CDGF may be used to indicate the risk of unwanted 

CC cell proliferation or differentiation.

CC CDGF may be used to induce neuronal differentiation in stem cell 

CC culture, and maintain the integrity of a terminally differentiated 

CC neuronal cell culture, e.g. useful for intracerebral grafting to 

CC alleviate behavioural defects. CDGF may also be used in nerve 

CC protheses to repair central and peripheral nerve damage, especially 

CC where a crushed or severed axon is entubulated by a prosthetic. 

CC CDGF may also be used to enhance neuronal cell survival in the 

CC central or peripheral nervous system, to treat neurological

CC conditions associated with nervous system injury, e.g. traumatic, 

CC chemical or vasal injury and deficits such as ischaemia resulting 

CC from stroke, infectious/inflammatory and tumour induced injury, 

CC chronic neurodegenerative disease including Parkinson's and 

CC Huntingdon's, amylotrophic lateral sclerosis, spinocerebellar 

CC degeneration, chronic immunological disease of the nervous system 

CC including multiple sclerosis, disorders of the sensory neurons and 

CC degenerative diseases of the retina. CDGF may also be used to treat 

CC neoplastic or hyperplastic transformations, particularly of the 

CC central nervous system, e.g. amalignant gliomas, medulloblastomas

CC and neuroectodermal tumours.

XX 

SQ Sequence 3441 BP; 777 A; 1057 C; 1015 G; 592 T; 0 other; 

Query Match 87.4%; Score 784; DB 18; Length 3441; 

Best Local Similarity 92.2%; Pred. No. 6.7e-173; 

Matches 826; Conservative 0; Mismatches 70; Indels 0; Gaps 0;
Qy      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
          ||||||||||||||||||||||||||||| |||||||||||||||||||| |||||||||
Db    180 ATGAGGCGCGACCCGGCCCCCGGCTTCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 239

Qy     61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
          |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db    240 TACTCGCCCAGCCTCAAGTCCGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 299

Qy    121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
          |||||||||||||| |||| ||| || ||||| |||||||| ||||||||||||||||| 
Db    300 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCT 359

Qy    181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
          ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db    360 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 419

Qy    241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
          |||||||||||||||||||||||||||||||||||||| | |||||||||||||||||||
Db    420 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGCGCGCCGCTCGAAAGGAACCAG 479

Qy    301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
          |||||||||||||||||||||||||| || ||||||||||| |||||||| |||||||||
Db    480 CGCTACATCTTTTTCCTGGAGCCCACCGAGCAGCCCTTAGTTTTTAAGACAGCCTTTGCC 539

Qy    361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
          ||  ||||  | |||||||||||  |||||||||||||||||||||||||||||||||||
Db    540 CCGGTCGACCCTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 599

Qy    421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
          ||||| |||||||||||| |||||||||||||||| ||||| ||| ||||||| ||||||
Db    600 TGCGCAACCCGGCCCAAGCTGAAGAAGATGAAGAGTCAGACAGGAGAGGTGGGCGAGAAG 659

Qy    481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
          || ||||| ||||||||||| || || || || |||||||| ||||| || |||||||||
Db    660 CAGTCGCTCAAGTGTGAGGCGGCGGCGGGGAACCCCCAGCCCTCCTATCGATGGTTCAAG 719

Qy    541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
          || ||||||||||||||||| || || |||||||||||||| ||||||||||||||||||
Db    720 GACGGCAAGGAGCTCAACCGGAGTCGTGACATTCGCATCAAGTATGGCAACGGCAGAAAG 779

Qy    601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
          |||||||| |||||||||||||| |||||||||||||||||||| ||||| ||||| |||
Db    780 AACTCACGGCTACAGTTCAACAAAGTGAAGGTGGAGGACGCTGGAGAGTACGTCTGTGAG 839

Qy    661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
          || ||||||||||| ||||||||||| ||  ||||||||||  | |||||||| ||||||
Db    840 GCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGTGAGC 899

Qy    721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
          ||||| ||||| ||||||||||||||||||||||||||||| ||||||||||||||||| 
Db    900 ACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTCCTAC 959

Qy    781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGT 840
          || || ||||||||||| |||||||||||||| ||||||||||| |||||||||||||||
Db    960 TGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAGTGT 1019

Qy    841 CCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTCCTA 896
          |||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db   1020 CCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTCCAA 1075



RESULT 5 
AAT87923 

ID AAT87923 standard; cDNA; 1803 BP. 
XX 

AC AAT87923; 
XX 

DT 18-DEC-1997 (first entry) 
XX 

DE Rat cerebellum derived growth factor 2 cDNA. 
XX 

KW Rat; cerebellum derived growth factor; CDGF2; screening; binding; 
KW modulation; erbB type receptor; identification; indication; risk; 



KW proliferation; differentiation; induction; neuron; hyperplasia; 

KW stem cell culture; intracerebral graft; alleviation; repair; 

KW behavioural defect; nervous system; central; peripheral; nerve; 

KW prothesis; damage; entubulation; cell survival; treatment; 

KW injury; trauma; ischaemia; ischemia; stroke; infection; disorder; 

KW inflammation; neurodegeneration; disease; Parkinson's; 

KW Huntingdon's; amylotrophic lateral sclerosis; sensory; retina; 

KW spinocerebellar degeneration; multiple sclerosis; neoplasia; 

KW amalignant glioma; medulloblastoma; neuroectodermal tumour; ds.

XX 

OS Rattus rattus.
XX 

FH Key Location/Qualifiers 

FT CDS 1..993
FT /*tag= a
FT sig_peptide 1..69
FT /*tag= b
FT mat_peptide 70..990
FT /*tag= c
FT /product= cerebellum_derived_growth_factor
XX 

PN WO9709425-A1. 
XX 

PD 13-MAR-1997. 
XX 

PF 09-SEP-1996; 96WO-US14484.
XX

PR 08-SEP-1995; 95US-0525864.
XX 

PA (HARD ) HARVARD COLLEGE. 

PA (STRD ) UNIV LELAND STANFORD JUNIOR. 

PA (STRD ) UNIV LELAND S STANFORD. 

XX 

PI Chang H; 
XX 

DR WPI; 1997-192900/17. 

DR P-PSDB; AAW27537.
XX 

PT Rat and human cerebellum-derived growth factors - used in the 

PT treatment of neuronal injury and proliferative disorders 

XX 

PS Claim 17; Pages 70-74; 94pp; English. 
XX 

CC The present sequence encodes rat cerebellum derived growth factor 2 

CC (CDGF2), which can be used to screen for modulators of CDGF 

CC binding to erbB type receptors. Identification of a modification or 

CC mutation in a CDGF gene, or aberrant expression of a CDGF gene or 

CC levels of soluble CDGF may be used to indicate the risk of unwanted 

CC cell proliferation or differentiation. 

CC CDGF may be used to induce neuronal differentiation in stem cell 

CC culture, and maintain the integrity of a terminally differentiated 

CC neuronal cell culture, e.g. useful for intracerebral grafting to 

CC alleviate behavioural defects. CDGF may also be used in nerve

CC protheses to repair central and peripheral nerve damage, especially 

CC where a crushed or severed axon is entubulated by a prosthetic. 

CC CDGF may also be used to enhance neuronal cell survival in the 

CC central or peripheral nervous system, to treat neurological 



CC conditions associated with nervous system injury, e.g. traumatic, 

CC chemical or vasal injury and deficits such as ischaemia resulting 

CC from stroke, infectious/inflammatory and tumour induced injury, 

CC chronic neurodegenerative disease including Parkinson's and 

CC Huntingdon's, amylotrophic lateral sclerosis, spinocerebellar 

CC degeneration, chronic immunological disease of the nervous system 

CC including multiple sclerosis, disorders of the sensory neurons and 

CC degenerative diseases of the retina. CDGF may also be used to treat 

CC neoplastic or hyperplastic transformations, particularly of the 

CC central nervous system, e.g. amalignant gliomas, medulloblastomas 

CC and neuroectodermal tumours.
XX 

SQ Sequence 1803 BP; 408 A; 549 C; 537 G; 309 T; 0 other; 

Query Match 82.3%; Score 738.6; DB 18; Length 1803; 

Best Local Similarity 90.4%; Pred. No. 2.2e-162;

Matches 789; Conservative 0; Mismatches 84; Indels 0; Gaps 0; 
Qy      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
          ||||||||||||||||||||||||||||| |||||||||||||||||||| |||||||||
Db      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 60

Qy     61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
          |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db     61 TACTCGCCCAGCCTCAAGTCCGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120

Qy    121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
          |||||||||||||| |||| ||| || ||||| |||||||| |||||||||||||||||
Db    121 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCT 180

Qy    181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
          ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db    181 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240

Qy    241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
          |||||||||||||||||||||||||||||||||||||| | |||||||||||||||||||
Db    241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGCGCGCCGCTCGAAAGGAACCAG 300

Qy    301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
          |||||||||||||||||||||||||| || ||||||||||| |||||||| |||||||||
Db    301 CGCTACATCTTTTTCCTGGAGCCCACCGAGCAGCCCTTAGTTTTTAAGACAGCCTTTGCC 360

Qy    361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
          ||  ||||  | |||||||||||  |||||||||||||||||||||||||||||||||||
Db    361 CCGGTCGACCCTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420

Qy    421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
          ||||| |||||||||||| |||||||||||||||| ||||| ||| ||||||| ||||||
Db    421 TGCGCAACCCGGCCCAAGCTGAAGAAGATGAAGAGTCAGACAGGAGAGGTGGGCGAGAAG 480

Qy    481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
          || ||||| ||||||||||| || || || || |||||||| ||||| || |||||||||
Db    481 CAGTCGCTCAAGTGTGAGGCGGCGGCGGGGAACCCCCAGCCCTCCTATCGATGGTTCAAG 540

Qy    541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
          || ||||||||||||||||| || || |||||||||||||| ||||||||||||||||||
Db    541 GACGGCAAGGAGCTCAACCGGAGTCGTGACATTCGCATCAAGTATGGCAACGGCAGAAAG 600

Qy    601 660
Db    601 660

Qy    661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
          || ||||||||||| ||||||||||| ||  ||||||||||  | |||||||| ||||||
Db    661 GCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGTGAGC 720

Qy    721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
          ||||| ||||| ||||||||||||||||||||||||||||| |||||||||||||||||
Db    721 ACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTCCTAC 780

Qy    781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGT 840
          || || ||||||||||| |||||||||||||| ||||||||||| ||||||||||| |||
Db    781 TGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAATGT 840

Qy    841 CCTGTGGGATACACCGGGGACAGGTGTCAGCAG 873
          ||    |||| |  |||  | || |||  | ||
Db    841 CCAAACGGATTCTTCGGACAGAGATGTTTGGAG 873
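
The report prints "Best Local Similarity" next to the match counts but does not state how the percentage is derived. The sketch below assumes it is the number of matches divided by the total number of aligned columns (matches + mismatches + indels). That assumption reproduces the printed values for the records listed here, but it is an inference from the numbers, not a documented GenCore formula. The function name is hypothetical.

```python
def best_local_similarity(matches: int, mismatches: int, indels: int) -> float:
    """Return matches as a percentage of all aligned columns (assumed formula)."""
    aligned_columns = matches + mismatches + indels
    if aligned_columns == 0:
        raise ValueError("alignment has no columns")
    return 100.0 * matches / aligned_columns


# (matches, mismatches, indels) copied from the statistics lines in this listing.
reported = {
    "AAT87923": (789, 84, 0),   # printed as 90.4%
    "AAV43674": (788, 85, 0),   # printed as 90.3%
    "ABS56035": (840, 16, 17),  # printed as 96.2%
}

for accession, counts in reported.items():
    print(accession, round(best_local_similarity(*counts), 1))
```

Each of these alignments spans 873 columns, so the three percentages differ only through the match counts.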



RESULT 6 
AAV43674 

ID AAV43674 standard; cDNA; 3076 BP. 
XX 

AC AAV43674; 
XX 

DT 29-SEP-1998 (first entry) 
XX 

DE Receptor type tyrosine kinase ErbB ligand encoding cDNA. 
XX 

KW Receptor type tyrosine kinase ErbB; ligand; diagnostic agent; 

KW nervous disease; cancer; ss. 

XX 

OS Rattus sp. 
XX 

FH Key Location/Qualifiers 
FT CDS 232..2814

FT /*tag= a 

FT /product= "ligand of receptor type tyrosine kinase ErbB" 

XX 

PN JP10179166-A.
XX 

PD 07-JUL-1998. 
XX 

PF 25-DEC-1996; 96JP-0356998.
XX

PR 25-DEC-1996; 96JP-0356998.
XX

PA (HIGA/) HIGASHIYAMA S.
XX 

DR WPI; 1998-430952/37. 
DR P-PSDB; AAW63700. 
XX 

PT Gene coding the ligand of the tyrosine kinase ErbB receptor - useful 
PT for diagnosing and treating nervous diseases and cancer 



XX 

PS Examples; Pages 9-13; 17pp; Japanese. 
XX 

CC This cDNA encodes the ligand of receptor type tyrosine kinase ErbB. A 

CC prokaryotic or eukaryotic host cell transformed by a recombinant vector 

CC containing the encoding DNA can be used for the recombinant production of 

CC the protein. The invention provides a method for inhibiting the formation 

CC of the ligand of receptor type tyrosine kinase ErbB in an animal using 

CC an antibody recognizing the protein. The ligand of the tyrosine kinase 

CC ErbB receptor and associated materials can be used for treating or 

CC diagnosing nervous diseases and cancers. 
XX 

SQ Sequence 3076 BP; 673 A; 996 C; 944 G; 463 T; 0 other; 



Query Match 82.2%; Score 737; DB 19; Length 3076;

Best Local Similarity 90.3%; Pred. No. 5.8e-162; 

Matches 788; Conservative 0; Mismatches 85; Indels 0; Gaps 0; 



Qy      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
          ||||||||||||||||||||||||| ||| |||||||||||||||||||| |||||||||
Db    556 ATGAGGCGCGACCCGGCCCCCGGCTCCTCGATGCTGCTCTTCGGTGTGTCACTCGCCTGC 615

Qy     61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
          |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db    616 TACTCGCCCAGCCTCAAGTCCGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 675

Qy    121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
          |||||||||||||| |||| ||| || ||||| |||||||| |||||||||||||||||
Db    676 GGCAAGGTACAGGGACTGGCCCCGGCAGGCGGTTCCAGCTCTAACAGCACCCGAGAGCCT 735

Qy    181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
          ||||||||||||||||||||| |||| |||||||||||||||||||||||||||||||||
Db    736 CCCGCCTCGGGTCGGGTGGCGCTGGTGAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 795

Qy    241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
          |||||||||||||||||||||||||||||||||||||| | |||||||||||||||||||
Db    796 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGCGCGCCGCTCGAAAGGAACCAG 855

Qy    301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
          |||||||||||||||||||||||||| || ||||||||||| |||||||| |||||||||
Db    856 CGCTACATCTTTTTCCTGGAGCCCACCGAGCAGCCCTTAGTTTTTAAGACAGCCTTTGCC 915

Qy    361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
          ||  ||||  | |||||||||||  |||||||||||||||||||||||||||||||||||
Db    916 CCGGTCGACCCTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 975

Qy    421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
          ||||| |||||||||||| |||||||||||||||| ||||| ||| ||||||| ||||||
Db    976 TGCGCAACCCGGCCCAAGCTGAAGAAGATGAAGAGTCAGACAGGAGAGGTGGGCGAGAAG 1035

Qy    481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
          || ||||| ||||||||||| || || || || |||||||| ||||| || |||||||||
Db   1036 CAGTCGCTCAAGTGTGAGGCGGCGGCGGGGAACCCCCAGCCCTCCTATCGATGGTTCAAG 1095

Qy    541 600
Db   1096 1155

Qy    601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
          |||||||| |||||||||||||| |||||||||||||||||||| ||||| ||||| |||
Db   1156 AACTCACGGCTACAGTTCAACAAAGTGAAGGTGGAGGACGCTGGAGAGTACGTCTGTGAG 1215

Qy    661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
          || ||||||||||| ||||||||||| ||  ||||||||||  | |||||||| ||||||
Db   1216 GCTGAGAACATCCTTGGGAAGGACACTGTGAGGGGCCGGCTCCATGTCAACAGTGTGAGC 1275

Qy    721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
          ||||| ||||| ||||||||||||||||||||||||||||| |||||||||||||||||
Db   1276 ACCACTCTGTCGTCCTGGTCGGGGCACGCCCGGAAGTGCAATGAGACAGCCAAGTCCTAC 1335

Qy    781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGT 840
          || || ||||||||||| |||||||||||||| ||||||||||| ||||||||||| |||
Db   1336 TGTGTGAATGGAGGCGTGTGCTACTACATCGAAGGCATCAACCAACTCTCCTGCAAATGT 1395

Qy    841 CCTGTGGGATACACCGGGGACAGGTGTCAGCAG 873
          ||    |||| |  |||  | || |||  | ||
Db   1396 CCAAACGGATTCTTCGGACAGAGATGTTTGGAG 1428
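
In each alignment row, the line printed between Qy and Db marks the columns where the two sequences carry the same base. The sketch below shows one way to regenerate such a marker line from a Qy/Db pair, assuming a "|" for an identical base and a blank for a mismatch or gap column. The function name is hypothetical and this is not GenCore's own code.

```python
def match_line(query: str, subject: str) -> str:
    """Build the marker line for one alignment row.

    A column gets '|' when both sequences carry the same base and a blank
    otherwise; gap columns ('-') never count as a match.
    """
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    return "".join(
        "|" if q == s and q != "-" else " "
        for q, s in zip(query, subject)
    )


# Final row (Qy 841..873) of the alignment against AAV43674.
qy = "CCTGTGGGATACACCGGGGACAGGTGTCAGCAG"
db = "CCAAACGGATTCTTCGGACAGAGATGTTTGGAG"
markers = match_line(qy, db)
print(qy)
print(markers)
print(db)
print("identical columns:", markers.count("|"))
```

Counting the "|" characters over all rows of one alignment should give the "Matches" figure on that record's statistics line.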



RESULT 7 
ABS56035 

ID ABS56035 standard; cDNA; 1863 BP. 
XX 

AC ABS56035; 
XX 

DT 14-JAN-2003 (first entry) 
XX 

DE cDNA encoding human membrane-bound splice variant of Don-1. 
XX 

KW Human; Don-1; epidermal growth factor; EGF; neuregulin; 

KW glycoprotein ligand; cell proliferation; cell proliferative disorder; 

KW carcinoma; adenocarcinoma cell; myeloma; cell differentiation; 

KW cell survival; epithelial cell; wound healing; tumour formation; 

KW brain; vulnerary; cytostatic; gene therapy; gene; ss. 

XX 

OS Homo sapiens.
XX

FH Key Location/Qualifiers
FT CDS 643..1863
FT /*tag= a
FT /partial
FT /product= "Membrane-bound splice variant of Don-1"

FT /note= "This sequence lacks a stop codon" 

XX 

PN US2002127594-A1. 
XX 

PD 12-SEP-2002. 
XX 

PF 12-MAR-2002; 2002US-0096241.
XX

PR 22-JUN-2000; 2000US-0599789.
XX 

PA (GEAR/) GEARING DP. 
PA (BUSF/) BUSFIELD S J. 



XX 

PI Gearing DP, Busfield SJ; 
XX 

DR WPI; 2003-039584/03. 

DR P-PSDB; ABG71638. 
XX 

PT Novel Don-1 polypeptide useful for stimulating proliferation of cells, 

PT for identifying proteins that interact with Don-1, and for regulating 

PT tumour formation and progression in brain - 
XX 

PS Claim 4; Fig 3; 66pp; English. 
XX 

CC The present invention relates to the isolation of a novel gene 

CC called Don-1, and alternate splice variants of Don-1, which are 

CC related to epidermal growth factors (EGF) such as neuregulins.

CC Don-1 polypeptides are glycoprotein ligands. Both murine and human

CC Don-1 sequences are cloned. The mouse Don-1 gene maps to chromosome 18. 

CC Don-1 polypeptides are useful for stimulating proliferation of a cell. 

CC Antibodies to Don-1 polypeptides are useful for detecting Don-1 

CC in a sample. The Don-1 polypeptides are useful for treating and 

CC diagnosing cell proliferative disorders and play a role in the 

CC proliferation of carcinomas e.g. adenocarcinoma, myeloma, in cell 

CC differentiation, proliferation and survival. The polypeptides are 

CC also useful for inhibiting proliferation of adenocarcinoma cells, 

CC for stimulating the proliferation of cells such as epithelial cells

CC to promote wound healing, for identifying proteins that interact 

CC with Don-1, and for regulating tumour formation and progression in 

CC the brain. The polynucleotide sequences encoding Don-1 may be used 

CC in gene therapy. The present sequence encodes human membrane-bound 

CC splice variant of Don-1. 

XX 

SQ Sequence 1863 BP; 422 A; 602 C; 553 G; 286 T; 0 other; 

Query Match 80.6%; Score 723.4; DB 25; Length 1863; 

Best Local Similarity 96.2%; Pred. No. 7.5e-159; 

Matches 840; Conservative 0; Mismatches 16; Indels 17; Gaps 9; 

Qy      1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
          |||||||||||||||||||||  |||||||||||||||||||||||||||||||||||||
Db    213 ATGAGGCGCGACCCGGCCCCC--CTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 270

Qy     61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
          |||||||||||||||||||||  |||||||||||||||||||||||||||||||||||||
Db    271 TACTCGCCCAGCCTCAAGTCA--GCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 328

Qy    121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
          ||||||||||||||||||||  ||||||||||||||||||||||||||||||||||||||
Db    329 GGCAAGGTACAGGGGCTGGT--CAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 386

Qy    181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
          |||||||||||||||||||||  |||||||||||||||||||||||||||||||||||||
Db    387 CCCGCCTCGGGTCGGGTGGCG--GGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 444

Qy    241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
          |||||||||||||||||||||  |||||||||||||||||||||||||||||||||||||
Db    445 GGGCTGCAGCGCGAGCAGGTG--CAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 502

Qy    301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
          |||||||||||||||||||||  |||||||||||||||||||||||||||||||||||||
Db    503 CGCTACATCTTTTTCCTGGAG--CACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 560

Qy    361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
          ||||| ||||||||||||||||  |||||||||||||||||||||||||||||||||| |
Db    561 CCCCT-GATACCAACGGCAAAA--CTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGGC 617

Qy    421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
          ||||||||||||||||||||||  ||||||||||||||||||||||||||||||||||||
Db    618 TGCGCCACCCGGCCCAAGTTGA--AAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 675

Qy    481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    676 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 735

Qy    541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    736 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 795

Qy    601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    796 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 855

Qy    661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    856 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 915

Qy    721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    916 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 975

Qy    781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGT 840
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||| |||
Db    976 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 1035

Qy    841 CCTGTGGGATACACCGGGGACAGGTGTCAGCAG 873
          ||    |||| |  |||  | || |||  | ||
Db   1036 CCAAATGGATTCTTCGGACAGAGATGTTTGGAG 1068
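
The SQ line of each record gives the sequence length followed by the count of each base, so the counts can serve as a checksum on the length. The sketch below parses one SQ line and compares the two, which is a quick way to spot a digit that was misread during scanning. The helper name and regular expressions are illustrative assumptions, not part of any database toolkit.

```python
import re


def parse_sq_line(line: str) -> tuple[int, dict[str, int]]:
    """Return the declared length and the per-base counts from an SQ line."""
    length_match = re.search(r"Sequence\s+(\d+)\s+BP", line)
    if length_match is None:
        raise ValueError("not an SQ line")
    # Each count is written as '<number> <base>', e.g. '422 A' or '0 other'.
    counts = {
        name: int(number)
        for number, name in re.findall(r"(\d+)\s+([ACGT]|other)\b", line)
    }
    return int(length_match.group(1)), counts


# SQ line of record ABS56035, copied from this listing.
sq = "SQ Sequence 1863 BP; 422 A; 602 C; 553 G; 286 T; 0 other;"
length, counts = parse_sq_line(sq)
print(length, sum(counts.values()), length == sum(counts.values()))
```

If the declared length and the summed counts disagree, one of the numbers on the line is suspect and the ID line of the same record can be used to decide which.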



RESULT 8 
AAV17816 

ID AAV17816 standard; cDNA; 2268 BP. 
XX 

AC AAV17816; 
XX 

DT 17-AUG-1998 (first entry) 
XX 

DE Homo sapiens don-1 gene splice variant. 
XX 

KW Murine; don-1 gene; melanoma; treatment; adenocarcinoma; 

KW epithelial cell; proliferation; stimulation; treatment; tumours; 

KW skin; oesophagus; lung; breast; liver; pancreas; colon; prostate; 

KW gastrointestinal tract; uterus; wound healing; transmembrane; ss. 

XX 

OS Homo sapiens. 



XX 






FH Key Location/Qualifiers
FT CDS 69..2012
FT /*tag= a
FT /note= "don-1 polypeptide"
XX
PN WO9807736-A1.
XX
PD 26-FEB-1998.
XX
PF 18-AUG-1997; 97WO-US14585.
XX
PR 19-NOV-1996; 96US-0753007.
PR 19-AUG-1996; 96US-0699591.
XX
PA (MILL-) MILLENNIUM BIOTHERAPEUTICS INC.
XX
PI Busfield SJ, Gearing DP;
XX
DR WPI; 1998-169084/15.
DR P-PSDB; AAW48383.
XX
PT Mouse and human don-1 polypeptide(s) - useful for treatment of
PT melanomas and adenocarcinoma(s), and for wound healing
XX
PS Claim 4; Fig 7; 121pp; English.
XX
CC The sequence is that of a human don-1 gene splice variant.
CC Don-1 polypeptides stimulate proliferation of epithelial cells
CC and thus are implicated in melanomas and adenocarcinomas in which
CC epithelial cells proliferate out of control. Compounds that
CC interfere with don-1 mediated cell proliferation can be used
CC in the treatment of tumours such as melanomas and adenocarcinomas
CC of the skin, oesophagus, lung, breast, liver, pancreas,
CC gastrointestinal tract, colon, prostate or uterus. Alternatively,
CC don-1 polypeptides can be used to stimulate epithelial cell
CC proliferation, e.g. for wound healing.
XX
SQ Sequence 2268 BP; 502 A; 735 C; 700 G; 331 T; 0 other;



Query Match 47.7%; Score 427.8; DB 19; Length 2268; 

Best Local Similarity 89.8%; Pred. No. 5.4e-90; 

Matches 459; Conservative 0; Mismatches 52; Indels 0; Gaps 0;

Qy    363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
          || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||
Db     98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157

Qy    423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
           |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    158 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 217

Qy    483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277

Qy    543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337

Qy    603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    338 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 397

Qy    663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457

Qy    723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517

Qy    783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGTCC 842
          |||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||
Db    518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577

Qy    843 TGTGGGATACACCGGGGACAGGTGTCAGCAG 873
              |||| |  |||  | || |||  | ||
Db    578 AAATGGATTCTTCGGACAGAGATGTTTGGAG 608



RESULT 9 
ABS56036 

ID ABS56036 standard; cDNA; 1474 BP. 
XX 

AC ABS56036; 
XX 

DT 14-JAN-2003 (first entry) 
XX 

DE cDNA encoding human second splice variant of Don-1. 
XX 

KW Human; Don-1; epidermal growth factor; EGF; neuregulin; 

KW glycoprotein ligand; cell proliferation; cell proliferative disorder; 

KW carcinoma; adenocarcinoma cell; myeloma; cell differentiation; 

KW cell survival; epithelial cell; wound healing; tumour formation; 

KW brain; vulnerary; cytostatic; gene therapy; gene; ss. 

XX 

OS Homo sapiens.



XX 

FH Key Location/Qualifiers 

FT CDS 68..1473
FT /*tag= a
FT /partial
FT /product= "Second splice variant of Don-1"
FT /note= "This sequence lacks a stop codon"
FT /transl_except= (pos:107..108, aa:Lys)
FT /note= "This codon has an apparent 1 nucleotide
FT deletion which alters the reading frame"

XX 



PN US2002127594-A1. 
XX 

PD 12-SEP-2002. 
XX 

PF 12-MAR-2002; 2002US-0096241.
XX

PR 22-JUN-2000; 2000US-0599789.
XX 

PA (GEAR/) GEARING DP. 
PA (BUSF/) BUSFIELD S J. 
XX 

PI Gearing DP, Busfield SJ; 
XX 

DR WPI; 2003-039584/03. 

DR P-PSDB; ABG71639. 
XX 

PT Novel Don-1 polypeptide useful for stimulating proliferation of cells, 

PT for identifying proteins that interact with Don-1, and for regulating 

PT tumour formation and progression in brain - 
XX 

PS Claim 4; Fig 4; 66pp; English. 
XX 

CC The present invention relates to the isolation of a novel gene 

CC called Don-1, and alternate splice variants of Don-1, which are 

CC related to epidermal growth factors (EGF) such as neuregulins. 

CC Don-1 polypeptides are glycoprotein ligands. Both murine and human 

CC Don-1 sequences are cloned. The mouse Don-1 gene maps to chromosome 18.

CC Don-1 polypeptides are useful for stimulating proliferation of a cell. 

CC Antibodies to Don-1 polypeptides are useful for detecting Don-1 

CC in a sample. The Don-1 polypeptides are useful for treating and 

CC diagnosing cell proliferative disorders and play a role in the 

CC proliferation of carcinomas e.g. adenocarcinoma, myeloma, in cell 

CC differentiation, proliferation and survival. The polypeptides are 

CC also useful for inhibiting proliferation of adenocarcinoma cells, 

CC for stimulating the proliferation of cells such as epithelial cells 

CC to promote wound healing, for identifying proteins that interact 

CC with Don-1, and for regulating tumour formation and progression in 

CC the brain. The polynucleotide sequences encoding Don-1 may be used 

CC in gene therapy. The present sequence encodes human second 

CC splice variant of Don-1. 

XX 

SQ Sequence 1474 BP; 335 A; 472 C; 451 G; 216 T; 0 other; 



Query Match 47.6%; Score 426.8; DB 25; Length 1474; 

Best Local Similarity 92.4%; Pred. No. 8.3e-90; 

Matches 449; Conservative 0; Mismatches 37; Indels 0; Gaps 0; 

Qy    388 AAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCCGGCCCAAGTTGAAGAAG 447
          | |||   || | |  ||  |     |   ||    ||||||||||||||||||||||||
Db    121 AGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAAAGCCACCCGGCCCAAGTTGAAGAAG 180

Qy    448 ATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGAAGTGTGAGGCAGCAGCC 507
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 ATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGAAGTGTGAGGCAGCAGCC 240

Qy    508 GGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGGAGCTCAACCGCAGCCGA 567
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 GGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGGAGCTCAACCGCAGCCGA 300

Qy    568 GACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGACTACAGTTCAACAAGGTG 627
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 GACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGACTACAGTTCAACAAGGTG 360

Qy    628 AAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACC 687
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 AAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACC 420

Qy    688 GTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCAC 747
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    421 GTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCAC 480

Qy    748 GCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATGGAGGCGTCTGCTACTAC 807
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    481 GCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATGGAGGCGTCTGCTACTAC 540

Qy    808 ATCGAGGGCATCAACCAGCTCTCCTGCAAGTGTCCTGTGGGATACACCGGGGACAGGTGT 867
          ||||||||||||||||||||||||||||| |||||    |||| |  |||  | || |||
Db    541 ATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGATTCTTCGGACAGAGATGT 600

Qy    868 CAGCAG 873
            | ||
Db    601 TTGGAG 606



RESULT 10 
ABS56045 

ID ABS56045 standard; cDNA; 2266 BP. 
XX 

AC ABS56045; 
XX 

DT 14-JAN-2003 (first entry) 
XX 

DE cDNA encoding human third splice variant of Don-1. 
XX 

KW Human; Don-1; epidermal growth factor; EGF; neuregulin; 

KW glycoprotein ligand; cell proliferation; cell proliferative disorder;

KW carcinoma; adenocarcinoma cell; myeloma; cell differentiation; 

KW cell survival; epithelial cell; wound healing; tumour formation; 

KW brain; vulnerary; cytostatic; gene therapy; gene; ss. 

XX 

OS Homo sapiens.
XX

FH Key Location/Qualifiers
FT CDS 68..2010
FT /*tag= a
FT /product= "Third splice variant of Don-1"
FT /transl_except= (pos:107..108, aa:Lys)
FT /note= "This codon has an apparent 1 nucleotide
FT deletion which alters the reading frame"
FT /transl_except= (pos:994..996, aa:Thr)

XX 



PN US2002127594-A1. 
XX 

PD 12-SEP-2002. 
XX 

PF 12-MAR-2002; 2002US-0096241.
XX

PR 22-JUN-2000; 2000US-0599789.
XX 



PA (GEAR/) GEARING D P. 
PA (BUSF/) BUSFIELD S J. 
XX 

PI Gearing DP, Busfield SJ; 
XX 

DR WPI; 2003-039584/03. 

DR P-PSDB; ABG71644. 
XX 

PT Novel Don-1 polypeptide useful for stimulating proliferation of cells, 

PT for identifying proteins that interact with Don-1, and for regulating 

PT tumour formation and progression in brain - 
XX 

PS Claim 4; Fig 7; 66pp; English. 
XX 

CC The present invention relates to the isolation of a novel gene 

CC called Don-1, and alternate splice variants of Don-1, which are 

CC related to epidermal growth factors (EGF) such as neuregulins.

CC Don-1 polypeptides are glycoprotein ligands. Both murine and human 

CC Don-1 sequences are cloned. The mouse Don-1 gene maps to chromosome 18. 

CC Don-1 polypeptides are useful for stimulating proliferation of a cell. 

CC Antibodies to Don-1 polypeptides are useful for detecting Don-1 

CC in a sample. The Don-1 polypeptides are useful for treating and 

CC diagnosing cell proliferative disorders and play a role in the 

CC proliferation of carcinomas e.g. adenocarcinoma, myeloma, in cell 

CC differentiation, proliferation and survival. The polypeptides are

CC also useful for inhibiting proliferation of adenocarcinoma cells, 

CC for stimulating the proliferation of cells such as epithelial cells 

CC to promote wound healing, for identifying proteins that interact 

CC with Don-1, and for regulating tumour formation and progression in 

CC the brain. The polynucleotide sequences encoding Don-1 may be used 

CC in gene therapy. The present sequence encodes human third 

CC splice variant of Don-1.

XX 

SQ Sequence 2266 BP; 502 A; 733 C; 700 G; 331 T; 0 other; 

Query Match 47.6%; Score 426.8; DB 25; Length 2266; 

Best Local Similarity 92.4%; Pred. No. 9.2e-90;

Matches 449; Conservative 0; Mismatches 37; Indels 0; Gaps 0;

Qy    388 AAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCCGGCCCAAGTTGAAGAAG 447
          | |||   || | |  ||  |     |   ||    ||||||||||||||||||||||||
Db    121 AGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAAAGCCACCCGGCCCAAGTTGAAGAAG 180

Qy    448 ATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGAAGTGTGAGGCAGCAGCC 507
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 ATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGAAGTGTGAGGCAGCAGCC 240

Qy    508 GGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGGAGCTCAACCGCAGCCGA 567
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 GGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGGAGCTCAACCGCAGCCGA 300

Qy    568 GACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGACTACAGTTCAACAAGGTG 627
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 GACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGACTACAGTTCAACAAGGTG 360

Qy    628 AAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACC 687
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 AAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACATCCTGGGGAAGGACACC 420

Qy    688 GTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCAC 747
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    421 GTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGTCATCCTGGTCGGGGCAC 480

Qy    748 GCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATGGAGGCGTCTGCTACTAC 807
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    481 GCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATGGAGGCGTCTGCTACTAC 540

Qy    808 ATCGAGGGCATCAACCAGCTCTCCTGCAAGTGTCCTGTGGGATACACCGGGGACAGGTGT 867
          ||||||||||||||||||||||||||||| |||||    |||| |  |||  | || |||
Db    541 ATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAATGGATTCTTCGGACAGAGATGT 600

Qy    868 CAGCAG 873
            | ||
Db    601 TTGGAG 606




RESULT 11
AAV17815
ID AAV17815 standard; cDNA; 1476 BP.
XX
AC AAV17815;
XX
DT 17-AUG-1998 (first entry)
XX
DE Homo sapiens don-1 gene splice variant.
XX
KW Murine; don-1 gene; melanoma; treatment; adenocarcinoma;
KW epithelial cell; proliferation; stimulation; treatment; tumours;
KW skin; oesophagus; lung; breast; liver; pancreas; colon; prostate;
KW gastrointestinal tract; uterus; wound healing; transmembrane; ss.
XX
OS Homo sapiens.
XX
FH Key Location/Qualifiers
FT CDS 69..1475
FT /*tag= a
FT /note= "don-1 polypeptide"
XX
PN WO9807736-A1.
XX
PD 26-FEB-1998.
XX
PF 18-AUG-1997; 97WO-US14585.
XX
PR 19-NOV-1996; 96US-0753007.
PR 19-AUG-1996; 96US-0699591.
XX
PA (MILL-) MILLENNIUM BIOTHERAPEUTICS INC.
XX
PI Busfield SJ, Gearing DP;
XX
DR WPI; 1998-169084/15.
DR P-PSDB; AAW48382.
XX
PT Mouse and human don-1 polypeptide(s) - useful for treatment of
PT melanomas and adenocarcinoma(s), and for wound healing
XX
PS Claim 4; Fig 4; 121pp; English.
XX
CC The sequence is that of a human don-1 gene splice variant.
CC Don-1 polypeptides stimulate proliferation of epithelial cells
CC and thus are implicated in melanomas and adenocarcinomas in which
CC epithelial cells proliferate out of control. Compounds that
CC interfere with don-1 mediated cell proliferation can be used
CC in the treatment of tumours such as melanomas and adenocarcinomas
CC of the skin, oesophagus, lung, breast, liver, pancreas,
CC gastrointestinal tract, colon, prostate or uterus. Alternatively,
CC don-1 polypeptides can be used to stimulate epithelial cell
CC proliferation, e.g. for wound healing.
XX
SQ Sequence 1476 BP; 335 A; 475 C; 450 G; 216 T; 0 other;

Query Match 47.5%; Score 426.2; DB 19; Length 1476;
Best Local Similarity 89.6%; Pred. No. 1.1e-89;
Matches 458; Conservative 0; Mismatches 53; Indels 0; Gaps 0;

Qy  363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
        || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||
Db   98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157

Qy  423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
         |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  158 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 217

Qy  483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277

Qy  543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337

Qy  603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  338 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 397

Qy  663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457

Qy  723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517

Qy  783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGTCC 842
        |||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||
Db  518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577

Qy  843 TGTGGGATACACCGGGGACAGGTGTCAGCAG 873
Db  578



RESULT 12
ABL40993
ID ABL40993 standard; DNA; 1054 BP.
XX
AC ABL40993;
XX
DT 03-JUL-2002 (first entry)
XX
DE Human neuregulin 2 gene exon 1.
XX
KW e^raLT^u^^^ glycoprotein; cytostatic; cancer; tumour; ECD;
KW extracellular domain; neuregulin 2; isoform; gene; ds.
XX
OS Homo sapiens.
XX
PN WO200222685-A2.
XX
PD 21-MAR-2002.
XX
PF 11-SEP-2001; 2001WO-US28548.
XX
PR 11-SEP-2000; 2000US-231841P.
XX
PA (KUFE/) KUFE D W.
PA (OHNO/) OHNO T.
XX
PI Kufe DW, Ohno T;
XX
DR WPI; 2002-339864/37.
XX



II llLf t glycoprotein (MUCl) extracellular domain antagonist for 

PT manufacturing a medicant that inhibits the proliferation of Lc-l 

It g^^Jtr-'" ^'"^'^ ^^''^ ^^-^ ^ reducftuior 

XX 

PS Disclosure; Page 61-62; 74pp; English.

XX 

CC The invention relates to the use of a MUC1 (mucin glycoprotein)

cc r ^ ^^''''^ antagonist for the manuLcture of a medicant 

CC to inhibit the proliferation of MUC-1 expressing cancer cells. MUC1

CC antagonists (optionally combined with a pharmaceutical carrier) can be

CC uf^rrr'^ '""^'"^"^ proliferation of MUCl-expressing cancer cSls 

CC Tsllr- I """""" P-°-tate cancer and leukemi; 

CC :^T^^J:^- ll^r^t alSJlatinr^-l with administration 

CC z^'^r --er;^;^pe:Lii ^jf ieLL^^? ;o r;;:wtJ!%\:^^^ 

CC antSon?^?" screening to identify MUCl^CD 

CC antagonists. The present sequence represents an exon fragment of the 
CC human neuregulin 2 gene.

SQ Sequence 1054 BP; 178 A; 367 C; 361 G; 148 T; 0 other;

Query Match 47.3%; Score 424; DB 24; Length 1054;

Best Local Similarity 100.0%; Pred. No. 3.4e-89; 

Matches 424; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



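Qy    1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  589 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 648

Qy   61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  649 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 708

Qy  121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  709 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 768

Qy  181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  769 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 828

Qy  241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  829 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 888

Qy  301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  889 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 948

Qy  361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  949 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 1008

Qy  421 TGCG 424
        ||||
Db 1009 TGCG 1012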



RESULT 13
AAV17813
ID AAV17813 standard; cDNA; 1607 BP.
XX
AC AAV17813;
XX
DT 17-AUG-1998 (first entry)
XX
DE Mus musculus don-1 gene splice variant.
XX
KW Murine; don-1 gene; melanoma; treatment; adenocarcinoma;
KW epithelial cell; proliferation; stimulation; treatment; tumours;
KW skin; oesophagus; lung; breast; liver; pancreas; colon; prostate;
KW gastrointestinal tract; uterus; wound healing; secreted protein; ss.
XX
OS Mus musculus.
XX
FH Key Location/Qualifiers
FT CDS 79..624
FT /*tag= a
FT /note= "secreted don-1 polypeptide"
XX
PN WO9807736-A1.
XX
PD 26-FEB-1998.
XX
PF 18-AUG-1997; 97WO-US14585.
XX
PR 19-NOV-1996; 96US-0753007.
PR 19-AUG-1996; 96US-0699591.
XX
PA (MILL-) MILLENNIUM BIOTHERAPEUTICS INC.
XX
PI Busfield SJ, Gearing DP;
XX
DR WPI; 1998-169084/15.
DR P-PSDB; AAW48380.
XX
PT Mouse and human don-1 polypeptide(s) - useful for treatment of
PT melanomas and adenocarcinoma(s), and for wound healing
XX
PS Claim 4; Fig 2; 121pp; English.
XX
CC The sequence is that of a murine don-1 gene splice variant.
CC Don-1 polypeptides stimulate proliferation of epithelial cells
CC and thus are implicated in melanomas and adenocarcinomas in which
CC epithelial cells proliferate out of control. Compounds that
CC interfere with don-1 mediated cell proliferation can be used
CC in the treatment of tumours such as melanomas and adenocarcinomas
CC of the skin, oesophagus, lung, breast, liver, pancreas,
CC gastrointestinal tract, colon, prostate or uterus. Alternatively,
CC don-1 polypeptides can be used to stimulate epithelial cell
CC proliferation, e.g. for wound healing.
XX
SQ Sequence 1607 BP; 365 A; 500 C; 480 G; 262 T; 0 other;

Query Match 45.2%; Score 405.4; DB 19; Length 1607;
Best Local Similarity 87.9%; Pred. No. 8.2e-85;
Matches 442; Conservative 0; Mismatches 61; Indels 0; Gaps 0;

Qy  371 CCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 430
        | |||||||||||  |||||||||||||||||||||||||||||||||||||||||||||
Db    2 CTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 61

Qy  431 GGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGA 490
        |||||||| |||||||||||||||||||||| ||| |||||||||||||||| ||||| |
Db   62 GGCCCAAGCTGAAGAAGATGAAGAGCCAGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCA 121

Qy  491 AGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGG 550
        ||||||||||||| || || || |||||||| ||||| || |||||||||||||||||||
Db  122 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 181

Qy  551 AGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGAC 610
        | |||||||| || || || ||||||||||| |||||||| | ||||||||||||||| |
Db  182 AACTCAACCGGAGTCGTGATATTCGCATCAAGTATGGCAATGTCAGAAAGAACTCACGGC 241

Qy  611 TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACA 670
        ||||||||||||| |||| ||||||||| || |||||||| ||||| |||||||||||||
Db  242 TACAGTTCAACAAAGTGAGGGTGGAGGATGCCGGGGAGTACGTCTGTGAGGCCGAGAACA 301

Qy  671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730
        |||| ||||||||||||||  ||||||| ||  | |||||||||||||||||||| ||||
Db  302 TCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACAGCGTGAGCACCACTCTGT 361

Qy  731 CATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATG 790
        ||||||||||||| || |||||||||||||| ||||| ||||||||||| || || ||||
Db  362 CATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCAAGTCCTACTGTGTGAATG 421

Qy  791 GAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGTCCTGTGGGAT 850
        ||||||| |||||||||||||||||||||||||||||||||||||| |||||    ||||
Db  422 GAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAACGGAT 481

Qy  851 ACACCGGGGACAGGTGTCAGCAG 873
Db  482



RESULT 14
AAV17812
ID AAV17812 standard; cDNA; 2467 BP.
XX
AC AAV17812;
XX
DT 17-AUG-1998 (first entry)
XX
DE Mus musculus don-1 gene splice variant.
XX
KW Murine; don-1 gene; melanoma; treatment; adenocarcinoma;
KW epithelial cell; proliferation; stimulation; treatment; tumours;
KW skin; oesophagus; lung; breast; liver; pancreas; colon; prostate;
KW gastrointestinal tract; uterus; wound healing; transmembrane; ss.
XX
OS Mus musculus.
XX
FH Key Location/Qualifiers
FT CDS 79..1896
FT /*tag= a
FT /note= "transmembrane don-1 polypeptide"
XX
PN WO9807736-A1.
XX
PD 26-FEB-1998.
XX
PF 18-AUG-1997; 97WO-US14585.
XX
PR 19-NOV-1996; 96US-0753007.
PR 19-AUG-1996; 96US-0699591.
XX
PA (MILL-) MILLENNIUM BIOTHERAPEUTICS INC.
XX
PI Busfield SJ, Gearing DP;
XX
DR WPI; 1998-169084/15.
DR P-PSDB; AAW48379.
XX
PT Mouse and human don-1 polypeptide(s) - useful for treatment of
PT melanomas and adenocarcinoma(s), and for wound healing
XX
PS Claim 4; Fig 1; 121pp; English.
XX
CC The sequence is that of a murine don-1 gene splice variant.
CC Don-1 polypeptides stimulate proliferation of epithelial cells
CC and thus are implicated in melanomas and adenocarcinomas in which
CC epithelial cells proliferate out of control. Compounds that
CC interfere with don-1 mediated cell proliferation can be used
CC in the treatment of tumours such as melanomas and adenocarcinomas
CC of the skin, oesophagus, lung, breast, liver, pancreas,
CC gastrointestinal tract, colon, prostate or uterus. Alternatively,
CC don-1 polypeptides can be used to stimulate epithelial cell
CC proliferation, e.g. for wound healing.
XX
SQ Sequence 2467 BP; 592 A; 752 C; 706 G; 417 T; 0 other;

Query Match 44.8%; Score 402.2; DB 19; Length 2467;
Best Local Similarity 87.5%; Pred. No. 5.1e-84;
Matches 440; Conservative 0; Mismatches 63; Indels 0; Gaps 0;

Qy  371 CCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 430
        | |||||||||||  |||||||||||||||||||||||||||||||||||||||||||||
Db    2 CTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 61

Qy  431 GGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGA 490
        |||||||| |||||||||||||||||||||| ||| |||||||||||||||| ||||| |
Db   62 GGCCCAAGCTGAAGAAGATGAAGAGCCAGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCA 121

Qy  491 AGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGG 550
        ||||||||||||| || || || |||||||| ||||| || |||||||||||||||||||
Db  122 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 181

Qy  551 AGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGAC 610
        | |||||||| || || || ||||||||||| |||||||| | ||||||||||||||| |
Db  182 AACTCAACCGGAGTCGTGATATTCGCATCAAGTATGGCAATGTCAGAAAGAACTCACGGC 241

Qy  611 TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACA 670
        ||||||||||||| ||||  |||||||| || |||||||| ||||| |||||||||||||
Db  242 TACAGTTCAACAAAGTGAGCGTGGAGGATGCCGGGGAGTACGTCTGTGAGGCCGAGAACA 301

Qy  671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730
        |||| ||||||||||||||  ||||||| ||  | ||||||||||||| |||||| ||||
Db  302 TCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACAGCGTGACCACCACTCTGT 361

Qy  731 CATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATG 790
        ||||||||||||| || |||||||||||||| ||||| ||||||||||| || || ||||
Db  362 CATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCAAGTCCTACTGTGTGAATG 421

Qy  791 GAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGTCCTGTGGGAT 850
        ||||||| |||||||||||||||||||||||||||||||||||||| |||||    ||||
Db  422 GAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAACGGAT 481

Qy  851 ACACCGGGGACAGGTGTCAGCAG 873
Db  482



RESULT 15
ABS56034
ID ABS56034 standard; cDNA; 1561 BP.
XX
AC ABS56034;
XX
DT 14-JAN-2003 (first entry)
XX
DE cDNA encoding murine secreted splice variant of Don-1.
XX
KW Murine; Don-1; epidermal growth factor- rrir.
KW glycoprotein ligand; cell prolifeS?o;- ^el'l
KW carcinoma; adenocarcinoma lell^'lyToZ'; cell Sj'f^r'e^tlitLn'^^^"^^
KW ^r^^'^L^'^^^' — ^ui"rf;:;^at?;n;
KW xnerary, cytostatic; gene therapy; chromosome 18; gene; ss.
XX
OS Mus sp.
XX
FH Key Location/Qualifiers
FT CDS 78..623
FT /*tag= a
FT /product= "Secreted splice variant of Don-1"
XX
PN US2002127594-A1.
XX
PD 12-SEP-2002.
XX
PF 12-MAR-2002; 2002US-0096241.
XX
PR 22-JUN-2000; 2000US-0599789.
XX
PA (GEAR/) GEARING D P.
PA (BUSF/) BUSFIELD S J.
XX
PI Gearing DP, Busfield SJ;
XX
DR WPI; 2003-039584/03.
DR P-PSDB; ABG71637.
XX
PT LTLen^:^y?ni~^^^ of cells,
PT tumour formation and progression in brain -
XX
PS Claim 4; Fig 2; 66pp; English.
XX

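CC The present invention relates to the isolation of a novel gene
CC called Don-1, and alternate splice variants of Don-1, which are
CC related to epidermal growth factors (EGF) such as neuregulins.
CC Don-1 polypeptides are glycoprotein ligands. Both murine and human
CC Don-1 sequences are cloned. The mouse Don-1 gene maps to chromosome 18.
CC Don-1 polypeptides are useful for stimulating proliferation of a cell.
CC Antibodies to Don-1 polypeptides are useful for detecting Don-1
CC in a sample. The Don-1 polypeptides are useful for treating and
CC diagnosing cell proliferative disorders and play a role in the
CC proliferation of carcinomas e.g. adenocarcinoma, myeloma, in cell
CC differentiation, proliferation and survival. The polypeptides are
CC also useful for inhibiting proliferation of adenocarcinoma cells,
CC for stimulating the proliferation of cells such as epithelial cells
CC to promote wound healing, for identifying proteins that interact
CC with Don-1, and for regulating tumour formation and progression in
CC the brain. The polynucleotide sequences encoding Don-1 may be used
CC in gene therapy. The present sequence encodes murine secreted
CC splice variant of Don-1.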

XX 

3« sequence X5ez aP, 3S1 ^, c; .es o, 2,e T, 0 ,th„, 

Query Match 43 ga. o 

Best Local Similarity li ,1'. f ^'^^ 25; Length 1561; 

Matches 441; Conservative a t ' 5.1e-82; 

0, Mismatches 61; mdels i- car^s 
Qy 371 CrAArr-nn...,, ' ^^^^ 



851 ACACCGGGGACAGGTGTCAGCAG 873 
Db TCTTCGGACAGAGATGTTTGGAG 503 



GenCore version 5.1.6
Copyright (c) 1993 - 2004 Compugen Ltd.

OM nucleic - nucleic search, using sw model

Run on: January 14, 2004, 07:16:01 ; Search time 66.4093 Seconds
        (without alignments)
        5961.825 Million cell updates/sec

Title: US-09-864-675-3
Perfect score: 897
Sequence: 1 atgaggcgcgacccggcccc caatggtcaacttctcctaa 897

Scoring table: IDENTITY_NUC
               Gapop 10.0 , Gapext 1.0

Searched: 569978 seqs, 220691566 residues

Total number of hits satisfying chosen parameters: 1139956

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database : Issued_Patents_NA:*
 1: /cgn2_6/ptodata/2/ina/5A_COMB.seq:*
 2: /cgn2_6/ptodata/2/ina/5B_COMB.seq:*
 3: /cgn2_6/ptodata/2/ina/6A_COMB.seq:*
 4: /cgn2_6/ptodata/2/ina/6B_COMB.seq:*
 5: /cgn2_6/ptodata/2/ina/PCTUS_COMB.seq:*
 6: /cgn2_6/ptodata/2/ina/backfiles1.seq:*

Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.
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1/ Appli 
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1 


No.   Score  Match %  Length  DB  ID                   Description
                                  US-08-036-555B-21    Sequence 21, Appl
 15      84      9.4    2003   1  US-08-469-569-21     Sequence 21, Appl
 16      84      9.4    2003   1  US-08-249-322A-21    Sequence 21, Appl
 17      84      9.4    2003   1  US-08-469-526A-21    Sequence 21, Appl
 18      84      9.4    2003   2  US-08-734-591A-21    Sequence 21, Appl
 19      84      9.4    2003   2  US-08-469-660-21     Sequence 21, Appl
 20      84      9.4    2003   3  US-08-341-018-71     Sequence 71, Appl
 21      84      9.4    2003   3  US-08-470-335-21     Sequence 21, Appl
 22      84      9.4    2003   3  US-08-735-021-21     Sequence 21, Appl
 23      84      9.4    2003   3  US-08-734-664A-21    Sequence 21, Appl
 24      84      9.4    2003   3  US-08-470-339-21     Sequence 21, Appl
 25      84      9.4    2003   4  US-08-467-602-21     Sequence 21, Appl
 26      84      9.4    2003   5  PCT-US94-05083C-21   Sequence 21, Appl
 27      84      9.4    2003   5  PCT-US95-06846A-21   Sequence 21, Appl
 28    83.4      9.3    1108   1  US-08-036-555B-135   Sequence 135, App
 29    83.4      9.3    1108   1  US-08-469-569-135    Sequence 135, App
 30    83.4      9.3    1108   1  US-08-249-322A-135   Sequence 135, App
 31    83.4      9.3    1108   1  US-08-469-526A-135   Sequence 135, App
 32    83.4      9.3    1108   2  US-08-734-591A-135   Sequence 135, App
 33    83.4      9.3    1108   2  US-08-469-660-135    Sequence 135, App
 34    83.4      9.3    1108   3  US-08-341-018-5      Sequence 5, Appli
 35    83.4      9.3    1108   3  US-08-470-335-135    Sequence 135, App
 36    83.4      9.3    1108   3  US-08-735-021-135    Sequence 135, App
 37    83.4      9.3    1108   3  US-08-734-664A-135   Sequence 135, App
 38    83.4      9.3    1108   3  US-08-470-339-135    Sequence 135, App
 39    83.4      9.3    1108   4  US-08-467-602-135    Sequence 135, App
 40    83.4      9.3    1108   5  PCT-US94-05083C-131  Sequence 131, App
 41    83.4      9.3    1108   5  PCT-US95-06846A-135  Sequence 135, App
 42    78.4      8.7    1193   1  US-08-469-526A-134   Sequence 134, App
 43    78.4      8.7    1193   2  US-08-734-591A-134   Sequence 134, App
 44    78.4      8.7    1193   3  US-08-341-018-3      Sequence 3, Appli
 45    78.4      8.7    1193   3  US-08-470-335-134    Sequence 134, App



ALIGNMENTS 



RESULT 1
US-08-753-007A-5
; Sequence 5, Application US/08753007A
; Patent No. 6074841
; GENERAL INFORMATION:
; APPLICANT: Gearing, David P.
; APPLICANT: Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
; TITLE OF INVENTION: AND USES THEREFOR
; NUMBER OF SEQUENCES: 33
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Fish & Richardson P.C.
; STREET: 225 Franklin Street
; CITY: Boston
; STATE: MA
; COUNTRY: US
; ZIP: 02110-2804
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/753,007A
; FILING DATE: 19-NOV-1996
; CLASSIFICATION: 536
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/699,591
; FILING DATE: 19-AUG-1996
; ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter
; REGISTRATION NUMBER: 32,983
; REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 617-542-5070
; TELEFAX: 617-542-8906
; TELEX:
; INFORMATION FOR SEQ ID NO: 5:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1884 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA
; FEATURE:
; NAME/KEY: Coding Sequence
; LOCATION: 664...1883
; OTHER INFORMATION:
US-08-753-007A-5

Query Match 93.1%; Score 835.4; DB 3; Length 1884; 

Best Local Similarity 98.1%; Pred. No. 8.6e-200; 

Matches 856; Conservative 0; Mismatches 16; Indels 1; Gaps 1 
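The summary figures above can be cross-checked against one another. The Python sketch below assumes that Query Match is the score divided by the perfect score of 897 given in the run header, and that Best Local Similarity is matches divided by aligned columns. The scoring weights (match +1, mismatch -0.6, and 11 per single-base indel, i.e. Gapop 10.0 plus Gapext 1.0) are inferred from the numbers reported in this listing and are not taken from GenCore documentation; treat them as an assumption.

```python
def alignment_score(matches, mismatches, indels):
    """Score under the inferred scheme, computed in integer tenths.

    Assumed weights: +1 per match, -0.6 per mismatch, -11 per
    single-base indel (gap open 10 + gap extend 1).
    """
    tenths = 10 * matches - 6 * mismatches - 110 * indels
    return tenths / 10

def query_match(score, perfect_score):
    """Score as a percentage of the query's perfect score."""
    return round(100 * score / perfect_score, 1)

def best_local_similarity(matches, mismatches, indels):
    """Identical columns as a percentage of all aligned columns."""
    return round(100 * matches / (matches + mismatches + indels), 1)

# Figures reported for RESULT 1: Matches 856; Mismatches 16; Indels 1.
score = alignment_score(856, 16, 1)
print(score)                              # 835.4
print(query_match(score, 897))            # 93.1
print(best_local_similarity(856, 16, 1))  # 98.1
```

The same functions reproduce the figures of other hits in the listing, for example 449 matches and 37 mismatches give a score of 426.8.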

Qy    1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  218 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 277

Qy   61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  278 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 337

Qy  121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  338 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 397

Qy  181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  398 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 457

Qy  241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  458 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 517

Qy  301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  518 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 577

Qy  361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
        ||||| |||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db  578 CCCCT-GATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGGC 636

Qy  421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  637 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 696

Qy  481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  697 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 756

Qy  541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  757 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 816

Qy  601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  817 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 876

Qy  661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  877 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 936

Qy  721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  937 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 996

Qy  781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGT 840
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||| |||
Db  997 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 1056

Qy  841 CCTGTGGGATACACCGGGGACAGGTGTCAGCAG 873
        ||    |||| |  |||  | || |||  | ||
Db 1057 CCAAATGGATTCTTCGGACAGAGATGTTTGGAG 1089



RESULT 2
US-09-398-496-5
; Sequence 5, Application US/09398496
; Patent No. 6133423
; GENERAL INFORMATION:
; APPLICANT: Gearing, David P.
; APPLICANT: Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
; TITLE OF INVENTION: AND USES THEREFOR
; NUMBER OF SEQUENCES: 33
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Fish & Richardson P.C.
; STREET: 225 Franklin Street
; CITY: Boston
; STATE: MA
; COUNTRY: US
; ZIP: 02110-2804
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/09/398,496
; FILING DATE:
; CLASSIFICATION:
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/753,007
; FILING DATE: 19-NOV-1996
; APPLICATION NUMBER: 08/699,591
; FILING DATE: 19-AUG-1996
; ATTORNEY/AGENT INFORMATION:
; NAME: Fasse, J. Peter
; REGISTRATION NUMBER: 32,983
; REFERENCE/DOCKET NUMBER: 07334/022001
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 617-542-5070
; TELEFAX: 617-542-8906
; TELEX:
; INFORMATION FOR SEQ ID NO: 5:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1884 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA
; FEATURE:
; NAME/KEY: Coding Sequence
; LOCATION: 664...1883
; OTHER INFORMATION:
US-09-398-496-5

Query Match 93.1%; Score 835.4; DB 3; Length 1884;
Best Local Similarity 98.1%; Pred. No. 8.6e-200;
Matches 856; Conservative 0; Mismatches 16; Indels 1; Gaps 1;
Qy    1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  218 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 277

Qy   61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  278 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 337

Qy  121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  338 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 397

Qy  181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  398 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 457

Qy  241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  458 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 517

Qy  301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  518 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 577

Qy  361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
        ||||| |||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db  578 CCCCT-GATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGGC 636

Qy  421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  637 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 696

Qy  481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  697 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 756

Qy  541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  757 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 816

Qy  601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  817 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 876

Qy  661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  877 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 936

Qy  721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  937 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 996

Qy  781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGT 840
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||| |||
Db  997 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 1056

Qy  841 CCTGTGGGATACACCGGGGACAGGTGTCAGCAG 873
Db 1057



RESULT 3 

US-08-525-864A-1 

Sequence 1, Application US/08525864A 
Patent No. 5912326 
GENERAL INFORMATION: 

APPLICANT: Chang, Han 

TITLE OF INVENTION: Cerebellum-derived Growth Factors, and Uses 
TITLE OF INVENTION: Related thereto
NUMBER OF SEQUENCES: 18 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: LAHIVE & COCKFIELD 
STREET: 28 State Street 
CITY: Boston
STATE: Massachusetts 



COUNTRY: USA 
ZIP: 02109 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 
SOFTWARE: ASCII (text)
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/525,864A
FILING DATE: 8-SEP-1995 
CLASSIFICATION: 530 
; ATTORNEY/AGENT INFORMATION: 

NAME: Kara, Catherine J. 
REGISTRATION NUMBER: 41,106 
; REFERENCE/DOCKET NUMBER: HUI-017

TELECOMMUNICATION INFORMATION:
TELEPHONE: (617)227-7400
TELEFAX: (617)742-4214 
; INFORMATION FOR SEQ ID NO: 1: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 3441 base pairs 
; TYPE: nucleic acid 

; STRANDEDNESS: double 

TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
FEATURE:

NAME/KEY: CDS
LOCATION: 180..2441
US-08-525-864A-1 



Matches 826; Conservative 0; Mismatches [illegible]; Indels 0; Gaps 0;
[Alignment listing beginning at Qy 1 / Db 180; the sequence lines are not recoverable from the scan.]



RESULT 4 

US-08-525-864A-3 

Sequence 3, Application US/08525864A 
Patent No. 5912326 
GENERAL INFORMATION: 

APPLICANT: Chang, Han 

NUMBER OF SEQUENCES: 18 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: LAHIVE & COCKFIELD 
STREET: 28 State Street 
CITY: Boston 
STATE: Massachusetts 
COUNTRY: USA 
ZIP: 02109 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 



CURRENT APPLICATION DATA:

CLASSIFICATION: 530
ATTORNEY/AGENT INFORMATION:
NAME: Kara, Catherine J.
REGISTRATION NUMBER: 41,106
REFERENCE/DOCKET NUMBER: HUI-017
TELECOMMUNICATION INFORMATION:
TELEPHONE: (617)227-7400 
TELEFAX: (617)742-4214 



INFORMATION FOR SEQ ID NO: 3:
SEQUENCE CHARACTERISTICS:
LENGTH: 993 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
MOLECULE TYPE: cDNA
FEATURE:

NAME/KEY: CDS
LOCATION: 1..993
US-08-525-864A-3



Query Match [illegible]; Score [illegible]; DB 2; Length 993;

Best Local Similarity 90.4%; Pred. No. [illegible];

[remaining statistics illegible]



[Alignment listing beginning at Qy 1 / Db 1; the sequence lines are not recoverable from the scan.]



RESULT 5 

US-08-753-007A-7 

GENERAL INFORMATION:
APPLICANT: Busfield, Samantha J.
TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
TITLE OF INVENTION: AND USES THEREFOR
NUMBER OF SEQUENCES: 33
CORRESPONDENCE ADDRESS:
ADDRESSEE: Fish & Richardson P.C.
STREET: 225 Franklin Street
CITY: Boston
STATE: MA
COUNTRY: US
ZIP: 02110-2804
COMPUTER READABLE FORM:
MEDIUM TYPE: Diskette
COMPUTER: IBM Compatible
OPERATING SYSTEM: DOS
SOFTWARE: FastSEQ Version 2.0
CURRENT APPLICATION DATA:
APPLICATION NUMBER: US/08/753,007A
FILING DATE: 19-NOV-1996
CLASSIFICATION: 536
PRIOR APPLICATION DATA:
[illegible]
NAME: Fasse, J. Peter
REGISTRATION NUMBER: 32,983
INFORMATION FOR SEQ ID NO: 7:
LENGTH: 1476 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
US-08-753-007A-7
Query Match [illegible]; Mismatches 53; Gaps 0; [other statistics illegible]

[Alignment listing for Qy 363-873 against Db 98-608; the sequence lines are not recoverable from the scan.]



RESULT 6 

Sequence 7, Application US/09398496
Patent No. 6133423
GENERAL INFORMATION:
APPLICANT: Gearing, David P.
APPLICANT: Busfield, Samantha J.
TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
ADDRESSEE: Fish & Richardson P.C.
STREET: 225 Franklin Street
CITY: Boston
STATE: MA
COUNTRY: US
ZIP: 02110-2804
COMPUTER READABLE FORM:
MEDIUM TYPE: Diskette
COMPUTER: IBM Compatible
OPERATING SYSTEM: DOS
SOFTWARE: FastSEQ Version 2.0
CURRENT APPLICATION DATA:
APPLICATION NUMBER: US/09/398,496
CLASSIFICATION:
PRIOR APPLICATION DATA:
APPLICATION NUMBER: 08/753,007
FILING DATE: 19-NOV-1996
REGISTRATION NUMBER: 32,983
REFERENCE/DOCKET NUMBER: 07334/022001
INFORMATION FOR SEQ ID NO: 7:
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
MOLECULE TYPE: cDNA
FEATURE:



Query Match [illegible]; Score 427.8; Mismatches 59; [other statistics illegible]

[Alignment listing; the sequence lines are not recoverable from the scan.]



RESULT 7 

US-08-753-007A-31
Sequence 31, Application US/08753007A
Patent No. 6074841
GENERAL INFORMATION:
TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
TITLE OF INVENTION: AND USES THEREFOR
NUMBER OF SEQUENCES: 33
CORRESPONDENCE ADDRESS:
ADDRESSEE: Fish & Richardson P.C.
STREET: 225 Franklin Street
CITY: Boston
STATE: MA
COUNTRY: US
ZIP: 02110-2804
COMPUTER READABLE FORM:
MEDIUM TYPE: Diskette
COMPUTER: IBM Compatible
OPERATING SYSTEM: DOS
SOFTWARE: FastSEQ Version 2.0
CURRENT APPLICATION DATA:
APPLICATION NUMBER: US/08/753,007A
FILING DATE: 19-NOV-1996
CLASSIFICATION: 536
PRIOR APPLICATION DATA:
ATTORNEY/AGENT INFORMATION:
REGISTRATION NUMBER: 32,983
INFORMATION FOR SEQ ID NO: 31:
SEQUENCE CHARACTERISTICS:
LENGTH: 2268 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
MOLECULE TYPE: cDNA
FEATURE:



Query Match [illegible]; Mismatches 52; [other statistics illegible]

[Alignment listing beginning at Qy 363 / Db 98; the sequence lines are not recoverable from the scan.]



RESULT 8 

US-09-398-496-31 
Sequence 31, Application US/09398496
Patent No. 6133423
GENERAL INFORMATION:
APPLICANT: Gearing, David P.
APPLICANT: Busfield, Samantha J.
TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
CITY: Boston
STATE: MA
COUNTRY: US
ZIP: 02110-2804
COMPUTER READABLE FORM:
MEDIUM TYPE: Diskette
COMPUTER: IBM Compatible
OPERATING SYSTEM: DOS
SOFTWARE: FastSEQ Version 2.0
CURRENT APPLICATION DATA:
APPLICATION NUMBER: US/09/398,496
FILING DATE:
CLASSIFICATION:
PRIOR APPLICATION DATA:
APPLICATION NUMBER: 08/753,007
REGISTRATION NUMBER: 32,983
TELEX:
INFORMATION FOR SEQ ID NO: 31:
SEQUENCE CHARACTERISTICS:
LENGTH: [illegible] base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
MOLECULE TYPE: cDNA
FEATURE:
NAME/KEY: [illegible]
LOCATION: [illegible]

Query Match [illegible]; Mismatches 50; [other statistics illegible]

[Alignment listing; the sequence lines are not recoverable from the scan.]



RESULT 9 

US-08-753-007A-3 

GENERAL INFORMATION:
TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES
TITLE OF INVENTION: AND USES THEREFOR
NUMBER OF SEQUENCES: 33
CORRESPONDENCE ADDRESS:
ADDRESSEE: Fish & Richardson P.C.
STREET: 225 Franklin Street
CITY: Boston
STATE: MA
COUNTRY: US
ZIP: 02110-2804
SOFTWARE: FastSEQ Version 2.0
CURRENT APPLICATION DATA:
APPLICATION NUMBER: US/08/753,007A
FILING DATE: 19-NOV-1996
CLASSIFICATION: 536
PRIOR APPLICATION DATA:
[illegible]
REGISTRATION NUMBER: 32,983
INFORMATION FOR SEQ ID NO: 3:
SEQUENCE CHARACTERISTICS:
LENGTH: 1607 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
MOLECULE TYPE: cDNA
FEATURE:

Query Match [illegible];

Matches [illegible]; Conservative 0; Mismatches 61; Indels [illegible]; Gaps [illegible];

Qy 851 ACACCGGGGACAGGTGTCAGCAG 873
        |  |||  | || |||  | ||
Db 482 TCTTCGGACAGAGATGTTTGGAG 504

RESULT 10 
US-09-398-496-3 

Sequence 3, Application US/09398496
Patent No. 6133423
GENERAL INFORMATION:

APPLICANT: Gearing, David P.
APPLICANT: Busfield, Samantha J.

NUMBER OF SEQUENCES: 33

CORRESPONDENCE ADDRESS:

ADDRESSEE: Fish & Richardson P.C.

STREET: 225 Franklin Street
CITY: Boston

STATE: MA

COUNTRY: US

ZIP: 02110-2804

COMPUTER READABLE FORM:

MEDIUM TYPE: Diskette 

COMPUTER: IBM Compatible 

OPERATING SYSTEM: DOS 






SOFTWARE: FastSEQ Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/09/398,496
FILING DATE:
CLASSIFICATION:
PRIOR APPLICATION DATA:

APPLICATION NUMBER: 08/753,007
FILING DATE: 19-NOV-1996
APPLICATION NUMBER: 08/699,591
FILING DATE: 19-AUG-1996
ATTORNEY/AGENT INFORMATION:
NAME: Fasse, J. Peter
REGISTRATION NUMBER: 32,983
REFERENCE/DOCKET NUMBER: 07334/022001
TELECOMMUNICATION INFORMATION:
TELEPHONE: 617-542-5070
TELEFAX: 617-542-8906
TELEX:

INFORMATION FOR SEQ ID NO: 3:
SEQUENCE CHARACTERISTICS:
LENGTH: 1607 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
MOLECULE TYPE: cDNA
FEATURE:

NAME/KEY: Coding Sequence
LOCATION: 79...621
OTHER INFORMATION:
US-09-398-496-3

Query Match 45[rest of line illegible]

Matches 442; Conservative 0; Mismatches 61; Indels 0; Gaps 0;



[Alignment listing for Qy 371-873 against Db 2-504; the sequence lines are not recoverable from the scan.]



RESULT 11 
US-08-753-007A-1 

Sequence 1, Application US/08753007A 
Patent No. 6074841 
GENERAL INFORMATION: 

APPLICANT: Gearing, David P 
APPLICANT: Busfield, Samantha J.

TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 
TITLE OF INVENTION: AND USES THEREFOR 
NUMBER OF SEQUENCES: 33
CORRESPONDENCE ADDRESS:

ADDRESSEE: Fish & Richardson P.C.
STREET: 225 Franklin Street 
CITY: Boston 
STATE: MA 
COUNTRY: US 
ZIP: 02110-2804 
COMPUTER READABLE FORM: 
MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 
SOFTWARE: FastSEQ Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/753,007A
FILING DATE: 19-NOV-1996 
CLASSIFICATION: 536 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/699,591 
FILING DATE: 19-AUG-1996 
ATTORNEY/AGENT INFORMATION:
NAME: Fasse, J. Peter
REGISTRATION NUMBER: 32,983
REFERENCE/DOCKET NUMBER: 07334/022001
TELECOMMUNICATION INFORMATION:
TELEPHONE: 617-542-5070
TELEFAX: 617-542-8906
TELEX:

INFORMATION FOR SEQ ID NO: 1:
SEQUENCE CHARACTERISTICS:
LENGTH: 2467 base pairs
TYPE: nucleic acid
STRANDEDNESS: single



TOPOLOGY: circular 
; MOLECULE TYPE: cDNA 

FEATURE:

; NAME/KEY: Coding Sequence

LOCATION: 79...1893
; OTHER INFORMATION: 

US-08-753-007A-1 



Query Match 45.2%; Score 405.4; DB 3; Length 2467; 

Best Local Similarity 87.9%; Pred. No. 3.2e-92;

Matches 442; Conservative 0; Mismatches 61; Indels 0; Gaps 0; 

Qy 371 CCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 430
       | |||||||||||  |||||||||||||||||||||||||||||||||||||||||||||
Db   2 CTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 61

Qy 431 GGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGA 490
       |||||||| |||||||||||||||||||||| ||| |||||||||||||||| ||||| |
Db  62 GGCCCAAGCTGAAGAAGATGAAGAGCCAGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCA 121

Qy 491 AGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGG 550
       ||||||||||||| || || || |||||||| ||||| || |||||||||||||||||||
Db 122 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 181

Qy 551 AGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGAC 610
       | |||||||| || || || ||||||||||| |||||||| | ||||||||||||||| |
Db 182 AACTCAACCGGAGTCGTGATATTCGCATCAAGTATGGCAATGTCAGAAAGAACTCACGGC 241

Qy 611 TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACA 670
       ||||||||||||| |||| ||||||||| || |||||||| ||||| |||||||||||||
Db 242 TACAGTTCAACAAAGTGAGGGTGGAGGATGCCGGGGAGTACGTCTGTGAGGCCGAGAACA 301

Qy 671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730
       |||| ||||||||||||||  ||||||| ||  | |||||||||||||||||||| ||||
Db 302 TCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACAGCGTGAGCACCACTCTGT 361

Qy 731 CATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATG 790
       ||||||||||||| || |||||||||||||| ||||| ||||||||||| || || ||||
Db 362 CATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCAAGTCCTACTGTGTGAATG 421

Qy 791 GAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGTCCTGTGGGAT 850
       ||||||| |||||||||||||||||||||||||||||||||||||| |||||    ||||
Db 422 GAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAACGGAT 481

Qy 851 ACACCGGGGACAGGTGTCAGCAG 873
[match line and Db 482 line not recoverable from the scan]



RESULT 12 
US-09-398-496-1

; Sequence 1, Application US/09398496 

; Patent No. 6133423

; GENERAL INFORMATION: 

APPLICANT: Gearing, David P. 

APPLICANT: Busfield, Samantha J.
; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 



TITLE OF INVENTION: AND USES THEREFOR 
NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Fish & Richardson P.C.

STREET: 225 Franklin Street 
CITY: Boston 
STATE: MA 
COUNTRY: US 
ZIP: 02110-2804 
COMPUTER READABLE FORM: 
MEDIUM TYPE: Diskette 
; COMPUTER: IBM Compatible 

; OPERATING SYSTEM: DOS 

SOFTWARE: FastSEQ Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/09/398,496 
FILING DATE: 
CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/753,007 
FILING DATE: 19-NOV-1996 
APPLICATION NUMBER: 08/699,591 
; FILING DATE: 19-AUG-1996 

; ATTORNEY/AGENT INFORMATION:

; NAME: Fasse, J. Peter 

REGISTRATION NUMBER: 32,983 
REFERENCE/DOCKET NUMBER: 07334/022001 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 617-542-5070 
TELEFAX: 617-542-8906 
TELEX: 

; INFORMATION FOR SEQ ID NO: 1: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 2467 base pairs 
TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: circular 
MOLECULE TYPE: cDNA 
FEATURE:

NAME/KEY: Coding Sequence
LOCATION: 79...1893
OTHER INFORMATION: 
US-09-398-496-1 



Query Match 45.2%; Score 405.4; DB 3; Length 2467;

Best Local Similarity 87.9%; Pred. No. 3.2e-92;

Matches 442; Conservative 0; Mismatches 61; Indels 0; Gaps 0;
[Alignment listing for Qy 371-873 against Db 2 onward; the sequence lines are not recoverable from the scan.]



RESULT 13 
US-08-525-864A-5 

Sequence 5, Application US/08525864A
Patent No. 5912326 
GENERAL INFORMATION:

TITLE OF INVENTION: Cerebellum-derived Growth Factors, and Uses

TITLE OF INVENTION: Related thereto 
NUMBER OF SEQUENCES: 18 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: LAHIVE & COCKFIELD
STREET: 28 State Street 
CITY: Boston 
STATE: Massachusetts 
COUNTRY: USA 
ZIP: 02109 
COMPUTER READABLE FORM:

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 
SOFTWARE: ASCII (text)
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/525,864A
FILING DATE: 8-SEP-1995
CLASSIFICATION: 530
ATTORNEY/AGENT INFORMATION:
NAME: Kara, Catherine J.
REGISTRATION NUMBER: 41,106
REFERENCE/DOCKET NUMBER: HUI-017



TELECOMMUNICATION INFORMATION: 
TELEPHONE: (617)227-7400 
TELEFAX: (617)742-4214 
INFORMATION FOR SEQ ID NO: 5: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 1207 base pairs 
TYPE: nucleic acid 
STRANDEDNESS: double 
TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
FEATURE:

NAME/KEY: CDS
LOCATION: 2..394
US-08-525-864A-5 



Query Match 24.1%; Score 216.2; DB 2; Length 1207;

Best Local Similarity 86.3%; Pred. No. [illegible];

Matches 239; Conservative 0; Mismatches 38; Indels 0; Gaps 0;

[Alignment listing for Qy 597-873 against Db 1-277; the sequence lines are not recoverable from the scan.]



RESULT 14 

US-08-036-555B-21
Sequence 21, Application US/08036555B 
Patent No. 5530109 
GENERAL INFORMATION:

APPLICANT: Goodearl, Andrew; Stroobant, Paul;

APPLICANT: Minghetti, Luisa; Waterfield, Michael; Marchioni, Mark;
APPLICANT: Chen, Maio Su; Hiles, Ian 
TITLE OF INVENTION: Glial Mitogenic Factors, Their 
TITLE OF INVENTION: Preparation and Use 
NUMBER OF SEQUENCES: 184 
CORRESPONDENCE ADDRESS:

ADDRESSEE: Felfe & Lynch 
STREET: 805 Third Avenue 
CITY: New York City 
STATE: New York 
COUNTRY: USA 



ZIP: 10022 

COMPUTER READABLE FORM:

MEDIUM TYPE: Diskette, 5.25 inch, 360 kb storage

COMPUTER: IBM 
OPERATING SYSTEM: PC-DOS 
SOFTWARE: Wordperfect 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/036,555B
FILING DATE: 24-MAR-1993
CLASSIFICATION: 435
PRIOR APPLICATION DATA:

APPLICATION NUMBER: 07/965,173
FILING DATE: 23-OCT-1992 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 07/940,389 
FILING DATE: 03-SEP-1992 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 07/907,138 
FILING DATE: 30-JUN-1992 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 07/863,703 
FILING DATE: 03-APRIL-1992 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: U.K. 91 07566.3 
FILING DATE: 10-APRIL-1991
ATTORNEY/AGENT INFORMATION:
NAME: Tsai, Christine H.
REGISTRATION NUMBER: 34,266
REFERENCE/DOCKET NUMBER: LUD 5250.4 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (212) 688-9200 
TELEFAX: (212) 838-3884 
INFORMATION FOR SEQ ID NO: 21: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 2003 
TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: linear 

FEATURE:
OTHER INFORMATION: N in positions 31 and 32 could be either
OTHER INFORMATION: A or G.
US-08-036-555B-21

Query Match 9.4%; Score 84; DB 1; Length 2003;
[Alignment listing preceding Qy 879 / Db 1482; the sequence lines are not recoverable from the scan.]

Qy  879 AATGGTCAACTTCTCC 894
        ||||| || ||||| |
Db 1482 AATGGCCAGCTTCTAC 1497

RESULT 15 

US-08-469-569-21
Sequence 21, Application US/08469569 

Patent No. 5606032 
GENERAL INFORMATION:

APPLICANT: Goodearl, Andrew; Stroobant, Paul;
APPLICANT: Minghetti, Luisa; Waterfield, Michael; Marchioni, Mark;
APPLICANT: Chen, Maio Su; Hiles, Ian
TITLE OF INVENTION: Glial Mitogenic Factors, Their 
TITLE OF INVENTION: Preparation and Use 
NUMBER OF SEQUENCES: 184 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Felfe & Lynch 
STREET: 805 Third Avenue 



CITY: New York City 
STATE: New York 
COUNTRY: USA 
ZIP: 10022 

COMPUTER READABLE FORM:

MEDIUM TYPE: Diskette, 5.25 inch, 360 kb storage 

COMPUTER: IBM 
OPERATING SYSTEM: PC-DOS 
SOFTWARE: Wordperfect
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/469,569
FILING DATE: 06-JUN-1995 
CLASSIFICATION: 530 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/036,555 
FILING DATE: 24-MAR-1993 
APPLICATION NUMBER: 07/965,173 
FILING DATE: 23-OCT-1992 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 07/940,389 
FILING DATE: 03-SEP-1992 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 07/907,138 
FILING DATE: 30-JUN-1992 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 07/863,703 
FILING DATE: 03-APRIL-1992 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: U.K. 91 07566.3
FILING DATE: 10-APRIL-1991
ATTORNEY/AGENT INFORMATION:
NAME: Tsai, Christine H.
REGISTRATION NUMBER: 34,266
REFERENCE/DOCKET NUMBER: LUD 5250.4 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (212) 688-9200 
TELEFAX: (212) 838-3884 
INFORMATION FOR SEQ ID NO: 21: 
SEQUENCE CHARACTERISTICS:
LENGTH: 2003
TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: linear

OTHER INFORMATION: N in positions 31 and 32 could be either

OTHER INFORMATION: A or G. 
US-08-469-569-21 

Query Match 9.4%; Score 84; DB 1; Length 2003;

Best Local Similarity [illegible]; Pred. No. [illegible];
Matches 361; Conservative 0; Mismatches 345; Indels 30; Gaps 4;



[Alignment listing for Qy 189-878 against Db 762-1481; the sequence lines are not recoverable from the scan.]

Qy  879 AATGGTCAACTTCTCC 894
        ||||| || ||||| |
Db 1482 AATGGCCAGCTTCTAC 1497

Search completed: January 14, 2004, 10:25:25 
Job time : 58.4093 secs
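
The per-result statistics in the listing above can be cross-checked against one another. The following is a minimal sketch, not part of the GenCore output. It assumes, as an inference from the printed numbers rather than anything the report states, that IDENTITY_NUC scores +1 per match and -0.6 per mismatch, that each gap is charged Gapop once plus Gapext per inserted or deleted base, and that the perfect score equals the 897-base query length. All function names are hypothetical.

```python
# Cross-check of the printed GenCore statistics.
# ASSUMPTION: match = +1 and mismatch = -0.6 are inferred from the numbers
# in this listing; the report itself only names IDENTITY_NUC, Gapop, Gapext.

QUERY_LENGTH = 897  # length of US-09-864-675-3; its self-match scores 897


def alignment_score(matches, mismatches, indels=0, gaps=0,
                    mismatch_penalty=0.6, gapop=10.0, gapext=1.0):
    """Score implied by the counts on a 'Matches ...' line."""
    return (matches - mismatch_penalty * mismatches
            - gapop * gaps - gapext * indels)


def query_match_percent(score, perfect_score=QUERY_LENGTH):
    """'Query Match' taken as the score's share of the perfect score."""
    return 100.0 * score / perfect_score


def best_local_similarity_percent(matches, mismatches, indels=0):
    """Matches as a share of all aligned columns."""
    return 100.0 * matches / (matches + mismatches + indels)


if __name__ == "__main__":
    # RESULT 11: Matches 442; Mismatches 61; Indels 0; Gaps 0
    score = alignment_score(442, 61)
    print(round(score, 1),
          round(query_match_percent(score), 1),
          round(best_local_similarity_percent(442, 61), 1))
    # RESULT 15, as read from the scan: Matches 361; Mismatches 345;
    # Indels 30; Gaps 4
    print(round(alignment_score(361, 345, indels=30, gaps=4), 1))
```

Under these assumptions the sketch reproduces the printed Score 405.4, Query Match 45.2% and Best Local Similarity 87.9% for RESULT 11, and the printed Score 84 for RESULT 15. Agreement on a handful of results is consistent with the assumed scoring but does not prove it.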



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen 



OM nucleic - nucleic search, using sw model 

Run on:         January 14, 2004, 10:23:13 ; Search time 346.751 Seconds

                (without alignments)

                9118.095 Million cell updates/sec



Title:          US-09-864-675-3

Sequence:       [illegible in scan]

Scoring table:  IDENTITY_NUC

Gapop 10.0 , Gapext 1.0 
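
The header names a Smith-Waterman ("sw") model with the IDENTITY_NUC table, Gapop 10.0 and Gapext 1.0. As an illustration of what such a search scores, here is a minimal sketch of affine-gap local alignment scoring with those gap parameters. The match and mismatch values (+1 and -0.6) and the convention that a gap of length L costs Gapop + L x Gapext are assumptions inferred from the scores printed in this report; the function name is hypothetical, and this is not GenCore's implementation.

```python
# Affine-gap Smith-Waterman (Gotoh) score, sketched for illustration only.
# ASSUMPTIONS: match=+1 / mismatch=-0.6 are inferred from this report's
# scores; a gap of length L is charged gapop + L * gapext.


def sw_score(query, target, match=1.0, mismatch=-0.6, gapop=10.0, gapext=1.0):
    """Return the best local alignment score between two sequences."""
    neg_inf = float("-inf")
    cols = len(target)
    h_prev = [0.0] * (cols + 1)      # H values of the previous row
    f = [neg_inf] * (cols + 1)       # best score ending in a gap in `target`
    best = 0.0
    for i in range(1, len(query) + 1):
        h_cur = [0.0] * (cols + 1)
        e = neg_inf                  # best score ending in a gap in `query`
        for j in range(1, cols + 1):
            # Extend an open gap or open a new one, whichever is better.
            e = max(e - gapext, h_cur[j - 1] - gapop - gapext)
            f[j] = max(f[j] - gapext, h_prev[j] - gapop - gapext)
            step = match if query[i - 1] == target[j - 1] else mismatch
            # Local alignment: never let the running score drop below zero.
            h = max(0.0, h_prev[j - 1] + step, e, f[j])
            h_cur[j] = h
            best = max(best, h)
        h_prev = h_cur
    return best


if __name__ == "__main__":
    print(sw_score("ACGT", "ACGT"))
    print(sw_score("A" * 50 + "C" * 50, "A" * 50 + "G" * 20 + "C" * 50))
```

The sketch keeps only two rows of the scoring matrix, which is enough to obtain the score but not the alignment itself; recovering the Qy/Db lines shown in the listings would additionally require a traceback.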

Searched: 2324096 seqs, 1762381658 residues 

Total number of hits satisfying chosen parameters: 4648192

Minimum DB seq length: 0 
Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 

Database :      Published_Applications_NA:*

                1:  /cgn2_6/ptodata/1/pubpna/US07_PUBCOMB.seq:*
                2:  /cgn2_6/ptodata/1/pubpna/PCT_NEW_PUB.seq:*
                3:  /cgn2_6/ptodata/1/pubpna/US06_NEW_PUB.seq:*
                4:  /cgn2_6/ptodata/1/pubpna/US06_PUBCOMB.seq:*
                5:  /cgn2_6/ptodata/1/pubpna/US07_NEW_PUB.seq:*
                6:  /cgn2_6/ptodata/1/pubpna/PCTUS_PUBCOMB.seq:*
                7:  /cgn2_6/ptodata/1/pubpna/US08_NEW_PUB.seq:*
                8:  /cgn2_6/ptodata/1/pubpna/US08_PUBCOMB.seq:*
                9:  /cgn2_6/ptodata/1/pubpna/US09A_PUBCOMB.seq:*
                10: /cgn2_6/ptodata/1/pubpna/US09B_PUBCOMB.seq:*
                11: /cgn2_6/ptodata/1/pubpna/US09C_PUBCOMB.seq:*
                12: /cgn2_6/ptodata/1/pubpna/US09_NEW_PUB.seq:*
                13: /cgn2_6/ptodata/1/pubpna/US09_NEW_PUB.seq2:*
                14: /cgn2_6/ptodata/1/pubpna/US10A_PUBCOMB.seq:*
                15: /cgn2_6/ptodata/1/pubpna/US10B_PUBCOMB.seq:*
                16: /cgn2_6/ptodata/1/pubpna/US10_NEW_PUB.seq:*
                17: /cgn2_6/ptodata/1/pubpna/US60_NEW_PUB.seq:*
                18: /cgn2_6/ptodata/1/pubpna/US60_PUBCOMB.seq:*

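Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.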

SUMMARIES 

                        %
Result                Query
   No.   Score   Match   Length   DB   ID   Description

ALIGNMENTS



RESULT 1 
US-09-864-675-3 

; Sequence 3, Application US/09864675 

; Patent No. US20020081286A1 

; GENERAL INFORMATION: 

; APPLICANT: Marchionni, Mark 



TITLE OF INVENTION: NRG-2 NUCLEIC ACID MOLECULES, 

TITLE OF INVENTION: POLYPEPTIDES, AND DIAGNOSTIC AND THERAPEUTIC METHODS 
FILE REFERENCE: 04585/049002 
CURRENT APPLICATION NUMBER: US/09/864,675
CURRENT FILING DATE: 2001-05-23
PRIOR APPLICATION NUMBER: US 60/206,495
PRIOR FILING DATE: 2000-05-23
NUMBER OF SEQ ID NOS: 18

SOFTWARE: FastSEQ for Windows Version 4.0 
SEQ ID NO 3 
LENGTH: 897 
TYPE: DNA 

ORGANISM: Homo sapiens 
US-09-864-675-3 

Query Match 100.0%; Score 897; DB 9; Length 897; 

Best Local Similarity 100.0%; Pred. No. 5e-243;

Matches 897; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy   1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60

Qy  61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120

Qy 121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180

Qy 181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240

Qy 241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300

Qy 301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360

Qy 361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420

Qy 421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480

Qy 481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540

Qy 541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600

Qy 601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660

Qy 661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720

Qy 721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780

Qy 781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGT 840
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGT 840

Qy 841 CCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTCCTAA 897
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 841 CCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGCAATGGTCAACTTCTCCTAA 897



RESULT 2 
US-09-864-675-1 

Sequence 1, Application US/09864675 
Patent No. US20020081286A1 
GENERAL INFORMATION:
APPLICANT: Marchionni, Mark 

TITLE OF INVENTION: NRG-2 NUCLEIC ACID MOLECULES, 

TITLE OF INVENTION: POLYPEPTIDES, AND DIAGNOSTIC AND THERAPEUTIC METHODS 
FILE REFERENCE: 04585/049002 
CURRENT APPLICATION NUMBER: US/09/864,675
CURRENT FILING DATE: 2001-05-23 
PRIOR APPLICATION NUMBER: US 60/206,495 
PRIOR FILING DATE: 2000-05-23 
NUMBER OF SEQ ID NOS: 18

SOFTWARE: FastSEQ for Windows Version 4.0 
SEQ ID NO 1 
LENGTH: 994
TYPE: DNA 

ORGANISM: Homo sapiens 
US-09-864-675-1 

Query Match 94.6%; Score 849; DB 9; Length 994; 

Best Local Similarity 98.3%; Pred. No. 1.8e-229; 

Matches 858; Conservative 0; Mismatches 15; Indels 0; Gaps 0; 



Qy          1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60

Qy         61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120

Qy        121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180

Qy        181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240

Qy        241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300

Qy        301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360

Qy        361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420

Qy        421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480

Qy        481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540

Qy        541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600

Qy        601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660

Qy        661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720

Qy        721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780

Qy        781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGT 840
              |||||||||||||||||||||||||||||||||||||||||||||||||||||||| |||
Db        781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 840

Qy        841 CCTGTGGGATACACCGGGGACAGGTGTCAGCAG 873
              ||    |||| |  |||  | || |||  | ||
Db        841 CCAAATGGATTCTTCGGACAGAGATGTTTGGAG 873
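
The summary statistics printed for RESULT 2 can be cross-checked with simple arithmetic. The sketch below is not part of the GenCore output. It assumes that "Best Local Similarity" is the number of matches divided by the number of aligned columns (matches + mismatches + indels), and that "Query Match" is the score divided by the score of a perfect self-match, taken here to be 897, the last query position shown in the RESULT 1 alignment. The function names are hypothetical.

```python
def best_local_similarity(matches: int, mismatches: int, indels: int) -> float:
    """Percent identity over aligned columns (assumed definition)."""
    columns = matches + mismatches + indels
    return 100.0 * matches / columns


def query_match(score: float, perfect_score: float) -> float:
    """Score as a percentage of the perfect self-match score (assumed definition)."""
    return 100.0 * score / perfect_score


# Values copied from the RESULT 2 summary lines of the listing.
similarity = best_local_similarity(matches=858, mismatches=15, indels=0)
# 897 is an assumption: the query length inferred from the RESULT 1 alignment.
match_pct = query_match(score=849, perfect_score=897)

print(f"Best Local Similarity: {similarity:.1f}%")
print(f"Query Match: {match_pct:.1f}%")
```

Under these assumptions the two figures round to the 98.3% and 94.6% reported for RESULT 2.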



RESULT 3 
US-10-096-241-5 

; Sequence 5, Application US/10096241 
; Publication No. US20020127594A1 
GENERAL INFORMATION: 



APPLICANT: Gearing, David P.

Busfield, Samantha J.
TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 

AND USES THEREFOR 
NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Fish & Richardson P.C. 
STREET: 225 Franklin Street 
CITY: Boston 
STATE: MA
COUNTRY: US 
ZIP: 02110-2804 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 
SOFTWARE: FastSEQ Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/10/096,241
FILING DATE: 12-Mar-2002 
CLASSIFICATION: <Unknown> 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/699,591 
FILING DATE: 19-AUG-1996 
ATTORNEY/AGENT INFORMATION: 
NAME: Fasse, J. Peter 
REGISTRATION NUMBER: 32,983 
REFERENCE/DOCKET NUMBER: 07334/022001
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 617-542-5070 
TELEFAX: 617-542-8906 
TELEX: <Unknown> 
INFORMATION FOR SEQ ID NO: 5: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 1884 base pairs 
TYPE: nucleic acid 
STRANDEDNESS: single
TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
FEATURE: 

NAME/KEY: Coding Sequence 
LOCATION: 664...1883
OTHER INFORMATION: 
SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
US-10-096-241-5 

Query Match 93.1%; Score 835.4; DB 14; Length 1884; 

Best Local Similarity 98.1%; Pred. No. 1.4e-225; 

Matches 856; Conservative 0; Mismatches 16; Indels 1; Gaps 1; 
Qy          1 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        218 ATGAGGCGCGACCCGGCCCCCGGCTTCTCCATGCTGCTCTTCGGTGTGTCGCTCGCCTGC 277

Qy         61 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        278 TACTCGCCCAGCCTCAAGTCAGTGCAGGACCAGGCGTACAAGGCACCCGTGGTGGTGGAG 337

Qy        121 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        338 GGCAAGGTACAGGGGCTGGTCCCAGCCGGCGGCTCCAGCTCCAACAGCACCCGAGAGCCG 397

Qy        181 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        398 CCCGCCTCGGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGG 457

Qy        241 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        458 GGGCTGCAGCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAG 517

Qy        301 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        518 CGCTACATCTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACGGCCTTTGCC 577

Qy        361 CCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGAC 420
              ||||| |||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db        578 CCCCT-GATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGGC 636

Qy        421 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        637 TGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAG 696

Qy        481 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        697 CAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAG 756

Qy        541 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        757 GATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAG 816

Qy        601 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        817 AACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAG 876

Qy        661 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        877 GCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGC 936

Qy        721 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        937 ACCACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTAT 996

Qy        781 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGT 840
              |||||||||||||||||||||||||||||||||||||||||||||||||||||||| |||
Db        997 TGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGT 1056

Qy        841 CCTGTGGGATACACCGGGGACAGGTGTCAGCAG 873
              ||    |||| |  |||  | || |||  | ||
Db       1057 CCAAATGGATTCTTCGGACAGAGATGTTTGGAG 1089



RESULT 4 
US-10-096-241-7 

; Sequence 7, Application US/10096241



; Publication No. US20020127594A1 

GENERAL INFORMATION: 
; APPLICANT: Gearing, David P.

; Busfield, Samantha J.

TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 

AND USES THEREFOR 
NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Fish & Richardson P.C. 

; STREET: 225 Franklin Street 

CITY: Boston 
STATE: MA 
COUNTRY: US 
ZIP: 02110-2804 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
; OPERATING SYSTEM: DOS 

SOFTWARE: FastSEQ Version 2.0 
; CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/10/096,241 
FILING DATE: 12-Mar-2002 
CLASSIFICATION: <Unknown> 
; PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/699,591 
FILING DATE: 19-AUG-1996 
; ATTORNEY/AGENT INFORMATION: 

; NAME: Fasse, J. Peter 

; REGISTRATION NUMBER: 32,983 

; REFERENCE/DOCKET NUMBER: 07334/022001

TELECOMMUNICATION INFORMATION: 
TELEPHONE: 617-542-5070 
TELEFAX: 617-542-8906 
; TELEX: <Unknown> 

INFORMATION FOR SEQ ID NO: 7: 
; SEQUENCE CHARACTERISTICS: 

LENGTH: 1476 base pairs
; TYPE: nucleic acid 

; STRANDEDNESS: single 

; TOPOLOGY: linear 

MOLECULE TYPE: cDNA 
FEATURE: 

NAME/KEY: Coding Sequence 
; LOCATION: 69...1475

OTHER INFORMATION: 
SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
US-10-096-241-7 

Query Match 47.7%; Score 427.8; DB 14; Length 1476; 

Best Local Similarity 89.8%; Pred. No. 1.4e-110; 

Matches 459; Conservative 0; Mismatches 52; Indels 0; Gaps 0;

Qy        363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
              || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||
Db         98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157

Qy        423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
               |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        158 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 217

Qy        483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277

Qy        543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337

Qy        603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        338 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 397

Qy        663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457

Qy        723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517

Qy        783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGTCC 842
              |||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||
Db        518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577

Qy        843 TGTGGGATACACCGGGGACAGGTGTCAGCAG 873
                  |||| |  |||  | || |||  | ||
Db        578 AAATGGATTCTTCGGACAGAGATGTTTGGAG 608



RESULT 5 

US-10-096-241-31 

Sequence 31, Application US/10096241 
Publication No. US20020127594A1 
GENERAL INFORMATION: 

APPLICANT: Gearing, David P.

Busfield, Samantha J. 
TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 

AND USES THEREFOR 
NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS:

ADDRESSEE: Fish & Richardson P.C. 
STREET: 225 Franklin Street 
CITY: Boston 
STATE: MA 
COUNTRY: US 
ZIP: 02110-2804 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 
SOFTWARE: FastSEQ Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/10/096,241
FILING DATE: 12-Mar-2002 



CLASSIFICATION: <Unknown> 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/699,591 
FILING DATE: 19-AUG-1996 
ATTORNEY/AGENT INFORMATION: 
NAME: Fasse, J. Peter 
REGISTRATION NUMBER: 32,983 
REFERENCE/DOCKET NUMBER: 07334/022001
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 617-542-5070 
TELEFAX: 617-542-8906 
TELEX: <Unknown>
INFORMATION FOR SEQ ID NO: 31: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 2268 base pairs 
TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
FEATURE: 

NAME/KEY: Coding Sequence 
LOCATION: 69...2009
OTHER INFORMATION: 
SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
US-10-096-241-31 

Query Match 47.7%; Score 427.8; DB 14; Length 2268; 

Best Local Similarity 89.8%; Pred. No. 1.6e-110; 

Matches 459; Conservative 0; Mismatches 52; Indels 0; Gaps 0; 

Qy        363 CCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTG 422
              || ||  |  ||   |  | |    | |||   || | |  ||  |     |   ||
Db         98 CCGCGGCAAGAAGCACCCAGAGGGGAGGAAGCGGGAGAGGGAGCCCGATCCCGGGGAGAA 157

Qy        423 CGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 482
               |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        158 AGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCA 217

Qy        483 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 542
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        218 ATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGA 277

Qy        543 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 602
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        278 TGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAA 337

Qy        603 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 662
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        338 CTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGC 397

Qy        663 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 722
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        398 CGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCAC 457

Qy        723 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 782
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        458 CACCCTGTCATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTG 517

Qy        783 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGTCC 842
              |||||||||||||||||||||||||||||||||||||||||||||||||||||| |||||
Db        518 CGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCC 577

Qy        843 TGTGGGATACACCGGGGACAGGTGTCAGCAG 873
                  |||| |  |||  | || |||  | ||
Db        578 AAATGGATTCTTCGGACAGAGATGTTTGGAG 608



RESULT 6 
US-10-096-241-3 

; Sequence 3, Application US/10096241 
; Publication No. US20020127594A1 

GENERAL INFORMATION: 
; APPLICANT: Gearing, David P. 

; Busfield, Samantha J.

TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 

AND USES THEREFOR 
NUMBER OF SEQUENCES: 33 
; CORRESPONDENCE ADDRESS: 

; ADDRESSEE: Fish & Richardson P.C. 

STREET: 225 Franklin Street 

CITY: Boston 

STATE: MA 

COUNTRY: US 

ZIP: 02110-2804 
; COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 

COMPUTER: IBM Compatible 

OPERATING SYSTEM: DOS 

SOFTWARE: FastSEQ Version 2.0
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/10/096,241 

FILING DATE: 12-Mar-2002 

CLASSIFICATION: <Unknown> 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/699,591 

FILING DATE: 19-AUG-1996 
; ATTORNEY/AGENT INFORMATION: 

NAME: Fasse, J. Peter

REGISTRATION NUMBER: 32,983 
; REFERENCE/DOCKET NUMBER: 07334/022001

; TELECOMMUNICATION INFORMATION: 

TELEPHONE: 617-542-5070 
; TELEFAX: 617-542-8906 

TELEX: <Unknown> 
; INFORMATION FOR SEQ ID NO: 3: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 1607 base pairs 

TYPE: nucleic acid 

STRANDEDNESS: single 

TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
FEATURE:

; NAME/KEY: Coding Sequence

; LOCATION: 79...621



; OTHER INFORMATION: 

; SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

US-10-096-241-3 



Query Match 45.2%; Score 405.4; DB 14; Length 1607; 

Best Local Similarity 87.9%; Pred. No. 3e-104;

Matches 442; Conservative 0; Mismatches 61; Indels 0; Gaps 0;

Qy        371 CCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 430
              | |||||||||||  |||||||||||||||||||||||||||||||||||||||||||||
Db          2 CTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 61

Qy        431 GGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGA 490
              |||||||| |||||||||||||||||||||| ||| |||||||||||||||| ||||| |
Db         62 GGCCCAAGCTGAAGAAGATGAAGAGCCAGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCA 121

Qy        491 AGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGG 550
              ||||||||||||| || || || |||||||| ||||| || |||||||||||||||||||
Db        122 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 181

Qy        551 AGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGAC 610
              | |||||||| || || || ||||||||||| |||||||| | ||||||||||||||| |
Db        182 AACTCAACCGGAGTCGTGATATTCGCATCAAGTATGGCAATGTCAGAAAGAACTCACGGC 241

Qy        611 TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACA 670
              ||||||||||||| |||| ||||||||| || |||||||| ||||| |||||||||||||
Db        242 TACAGTTCAACAAAGTGAGGGTGGAGGATGCCGGGGAGTACGTCTGTGAGGCCGAGAACA 301

Qy        671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730
              |||| ||||||||||||||  ||||||| ||  | |||||||||||||||||||| ||||
Db        302 TCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACAGCGTGAGCACCACTCTGT 361

Qy        731 CATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATG 790
              ||||||||||||| || |||||||||||||| ||||| ||||||||||| || || ||||
Db        362 CATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCAAGTCCTACTGTGTGAATG 421

Qy        791 GAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGTCCTGTGGGAT 850
              ||||||| |||||||||||||||||||||||||||||||||||||| |||||    ||||
Db        422 GAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAACGGAT 481

Qy        851 ACACCGGGGACAGGTGTCAGCAG 873
               |  |||  | || |||  | ||
Db        482 TCTTCGGACAGAGATGTTTGGAG 504



RESULT 7 
US-10-096-241-1 

; Sequence 1, Application US/10096241 
; Publication No. US20020127594A1 
GENERAL INFORMATION: 

APPLICANT: Gearing, David P. 
; Busfield, Samantha J. 

; TITLE OF INVENTION: DON-1 GENE AND POLYPEPTIDES 

AND USES THEREFOR 
NUMBER OF SEQUENCES: 33 
; CORRESPONDENCE ADDRESS: 

; ADDRESSEE: Fish & Richardson P.C.

; STREET: 225 Franklin Street

; CITY: Boston 

; STATE: MA 

COUNTRY: US 
; ZIP: 02110-2804 

; COMPUTER READABLE FORM: 

; MEDIUM TYPE: Diskette 

COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 
SOFTWARE: FastSEQ Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/10/096,241
FILING DATE: 12-Mar-2002
CLASSIFICATION: <Unknown> 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/699,591 
FILING DATE: 19-AUG-1996 
ATTORNEY/AGENT INFORMATION: 
; NAME: Fasse, J. Peter 

REGISTRATION NUMBER: 32,983 
; REFERENCE/DOCKET NUMBER: 07334/022001

; TELECOMMUNICATION INFORMATION: 

TELEPHONE: 617-542-5070 
TELEFAX: 617-542-8906 
TELEX: <Unknown> 
INFORMATION FOR SEQ ID NO: 1: 
SEQUENCE CHARACTERISTICS:

LENGTH: 2467 base pairs 
; TYPE: nucleic acid 

; STRANDEDNESS: single 

; TOPOLOGY: circular 

MOLECULE TYPE: cDNA 
FEATURE:

; NAME/KEY: Coding Sequence

LOCATION: 79...1893
; OTHER INFORMATION: 

; SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

US-10-096-241-1 

Query Match 45.2%; Score 405.4; DB 14; Length 2467; 

Best Local Similarity 87.9%; Pred. No. 3.3e-104;

Matches 442; Conservative 0; Mismatches 61; Indels 0; Gaps 0;

Qy        371 CCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 430
              | |||||||||||  |||||||||||||||||||||||||||||||||||||||||||||
Db          2 CTAACGGCAAAAACATCAAGAAAGAGGTGGGCAAGATCCTGTGCACTGACTGCGCCACCC 61

Qy        431 GGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAATCGCTGA 490
              |||||||| |||||||||||||||||||||| ||| |||||||||||||||| ||||| |
Db         62 GGCCCAAGCTGAAGAAGATGAAGAGCCAGACAGGAGAGGTGGGTGAGAAGCAGTCGCTCA 121

Qy        491 AGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGATGGCAAGG 550
              ||||||||||||| || || || |||||||| ||||| || |||||||||||||||||||
Db        122 AGTGTGAGGCAGCGGCGGGAAACCCCCAGCCCTCCTATCGCTGGTTCAAGGATGGCAAGG 181

Qy        551 AGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAGAAAGAACTCACGAC 610
              | |||||||| || || || ||||||||||| |||||||| | ||||||||||||||| |
Db        182 AACTCAACCGGAGTCGTGATATTCGCATCAAGTATGGCAATGTCAGAAAGAACTCACGGC 241

Qy        611 TACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGTCTGCGAGGCCGAGAACA 670
              ||||||||||||| |||| ||||||||| || |||||||| ||||| |||||||||||||
Db        242 TACAGTTCAACAAAGTGAGGGTGGAGGATGCCGGGGAGTACGTCTGTGAGGCCGAGAACA 301

Qy        671 TCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAGCGTGAGCACCACCCTGT 730
              |||| ||||||||||||||  ||||||| ||  | |||||||||||||||||||| ||||
Db        302 TCCTTGGGAAGGACACCGTGAGGGGCCGACTCCATGTCAACAGCGTGAGCACCACTCTGT 361

Qy        731 CATCCTGGTCGGGGCACGCCCGGAAGTGCAACGAGACAGCCAAGTCCTATTGCGTCAATG 790
              ||||||||||||| || |||||||||||||| ||||| ||||||||||| || || ||||
Db        362 CATCCTGGTCGGGACATGCCCGGAAGTGCAATGAGACCGCCAAGTCCTACTGTGTGAATG 421

Qy        791 GAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAGTGTCCTGTGGGAT 850
              ||||||| |||||||||||||||||||||||||||||||||||||| |||||    ||||
Db        422 GAGGCGTGTGCTACTACATCGAGGGCATCAACCAGCTCTCCTGCAAATGTCCAAACGGAT 481

Qy        851 ACACCGGGGACAGGTGTCAGCAG 873
               |  |||  | || |||  | ||
Db        482 TCTTCGGACAGAGATGTTTGGAG 504



RESULT 8 

US-10-029-386-26613 

Sequence 26613, Application US/10029386 
Publication No. US20030194704A1
GENERAL INFORMATION:
APPLICANT: Penn, Sharron G.
APPLICANT: Rank, David R. 
APPLICANT: Hanzel, David K. 

TITLE OF INVENTION: HUMAN GENOME-DERIVED SINGLE EXON NUCLEIC ACID PROBES USEFUL FOR GENE
TITLE OF INVENTION: EXPRESSION ANALYSIS TWO
FILE REFERENCE: AEOMICA-X-2
CURRENT APPLICATION NUMBER: US/10/029,386
CURRENT FILING DATE: 2001-12-20
NUMBER OF SEQ ID NOS: 34288
SOFTWARE: Annomax Sequence Listing Engine vers. 1.1
SEQ ID NO 26613 
LENGTH: 201 
TYPE: DNA 

ORGANISM: Homo sapiens 
FEATURE:
OTHER INFORMATION: MAP TO CHR5.3
OTHER INFORMATION: EXPRESSED IN HEART, SIGNAL = 0.55
OTHER INFORMATION: EXPRESSED IN ADULT LIVER, SIGNAL = 0.49
OTHER INFORMATION: EXPRESSED IN FETAL LIVER, SIGNAL = 0.72
OTHER INFORMATION: EXPRESSED IN BRAIN, SIGNAL = 0.66
OTHER INFORMATION: SWISSPROT HIT: O14511, EVALUE 3.00e-29
OTHER INFORMATION: NT HIT: AF119152.1, EVALUE 1.00e-109
OTHER INFORMATION: EST_HUMAN HIT: BF108794.1, EVALUE 3.00e-93
US-10-029-386-26613



Query Match 19.3%; Score 173; DB 13; Length 201; 

Best Local Similarity 100.0%; Pred. No. 6.9e-39; 

Matches 173; Conservative 0; Mismatches 0; Indels 0; Gaps 0;



Qy        424 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 483
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         27 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 86

Qy        484 TCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 543
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         87 TCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 146

Qy        544 GGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAG 596
              |||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        147 GGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAG 199



RESULT 9 

US-10-029-386-12913

Sequence 12913, Application US/10029386 
Publication No. US20030194704A1 
GENERAL INFORMATION: 
APPLICANT: Penn, Sharron G. 
APPLICANT: Rank, David R. 
APPLICANT: Hanzel, David K. 

TITLE OF INVENTION: HUMAN GENOME-DERIVED SINGLE EXON NUCLEIC ACID PROBES USEFUL FOR GENE
TITLE OF INVENTION: EXPRESSION ANALYSIS TWO
FILE REFERENCE: AEOMICA-X-2
CURRENT APPLICATION NUMBER: US/10/029,386
CURRENT FILING DATE: 2001-12-20
NUMBER OF SEQ ID NOS: 34288
SOFTWARE: Annomax Sequence Listing Engine vers. 1.1
SEQ ID NO 12913
LENGTH: 573
TYPE: DNA

ORGANISM: Homo sapiens 
FEATURE:
OTHER INFORMATION: MAP TO CHR5.3
OTHER INFORMATION: EXPRESSED IN HEART, SIGNAL = 0.55
OTHER INFORMATION: EXPRESSED IN ADULT LIVER, SIGNAL = 0.49
OTHER INFORMATION: EXPRESSED IN FETAL LIVER, SIGNAL = 0.72
OTHER INFORMATION: EXPRESSED IN BRAIN, SIGNAL = 0.66
OTHER INFORMATION: SWISSPROT HIT: O14511, EVALUE 2.00e-28
OTHER INFORMATION: NT HIT: AF119152.1, EVALUE 0.00e+00
OTHER INFORMATION: EST_HUMAN HIT: BG996653.1, EVALUE 1.00e-108
US-10-029-386-12913



Query Match 19.3%; Score 173; DB 13; Length 573; 

Best Local Similarity 100.0%; Pred. No. 8.9e-39; 

Matches 173; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy        424 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 483
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        377 GCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGAGAAGCAA 436

Qy        484 TCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 543
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        437 TCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTTCAAGGAT 496

Qy        544 GGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAG 596
              |||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        497 GGCAAGGAGCTCAACCGCAGCCGAGACATTCGCATCAAATATGGCAACGGCAG 549



RESULT 10 

US-10-029-386-2532/c

Sequence 2532, Application US/10029386 
Publication No. US20030194704A1 
GENERAL INFORMATION: 
APPLICANT: Penn, Sharron G. 
APPLICANT: Rank, David R. 
APPLICANT: Hanzel, David K. 

TITLE OF INVENTION: HUMAN GENOME-DERIVED SINGLE EXON NUCLEIC ACID PROBES USEFUL FOR GENE
TITLE OF INVENTION: EXPRESSION ANALYSIS TWO
FILE REFERENCE: AEOMICA-X-2
CURRENT APPLICATION NUMBER: US/10/029,386
CURRENT FILING DATE: 2001-12-20
NUMBER OF SEQ ID NOS: 34288
SOFTWARE: Annomax Sequence Listing Engine vers. 1.1
SEQ ID NO 2532
LENGTH: 579
TYPE: DNA
ORGANISM: Homo sapiens
FEATURE:
OTHER INFORMATION: MAP TO CHR5.1
OTHER INFORMATION: EXPRESSED IN HEART, SIGNAL = 2.4
OTHER INFORMATION: EXPRESSED IN BONE MARROW, SIGNAL = 3.
OTHER INFORMATION: EXPRESSED IN ADULT LIVER, SIGNAL = 3.
OTHER INFORMATION: EXPRESSED IN PLACENTA, SIGNAL = 4.3
OTHER INFORMATION: EXPRESSED IN HELA, SIGNAL = 5.2
OTHER INFORMATION: EXPRESSED IN FETAL LIVER, SIGNAL = 3.4
OTHER INFORMATION: EXPRESSED IN BRAIN, SIGNAL = 4.6
OTHER INFORMATION: NT HIT: AF119153.1, EVALUE 0.00e+00
OTHER INFORMATION: EST_HUMAN HIT: BF108794.1, EVALUE 2.00e-57
OTHER INFORMATION: SWISSPROT HIT: O14511, EVALUE 2.00e-12
US-10-029-386-2532



Query Match 12.7%; Score 113.6; DB 13; Length 579;

Best Local Similarity 92.9%; Pred. No. 5.1e-22;

Matches 130; Conservative 0; Mismatches 9; Indels 1; Gaps 1;

Qy        594 CAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGT 653
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        489 CAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGT 430

Qy        654 CTGCGAGGCCGAGAACATCCTGGGGAAGGACACCG-TCCGGGGCCGGCTTTACGTCAACA 712
              ||||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||
Db        429 CTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGCTCCGGGGCCGGCTTTACGTCAACA 370

Qy        713 GCGTGAGCACCACCCTGTCA 732
              |||  ||     ||| | ||
Db        369 GCGGTAGGTGGGCCCAGACA 350



RESULT 11 



US-10-029-386-16232/c

Sequence 16232, Application US/10029386 
Publication No. US20030194704A1 
GENERAL INFORMATION: 
APPLICANT: Penn, Sharron G. 
APPLICANT: Rank, David R. 
APPLICANT: Hanzel, David K. 

TITLE OF INVENTION: HUMAN GENOME-DERIVED SINGLE EXON NUCLEIC ACID PROBES USEFUL FOR GENE
TITLE OF INVENTION: EXPRESSION ANALYSIS TWO
FILE REFERENCE: AEOMICA-X-2
CURRENT APPLICATION NUMBER: US/10/029,386
CURRENT FILING DATE: 2001-12-20
NUMBER OF SEQ ID NOS: 34288

SOFTWARE: Annomax Sequence Listing Engine vers. 1.1 
SEQ ID NO 16232 
LENGTH: 171 
TYPE: DNA 

ORGANISM: Homo sapiens 
FEATURE: 

OTHER INFORMATION: MAP TO CHR5.1
OTHER INFORMATION: EXPRESSED IN HEART, SIGNAL = 2.4
OTHER INFORMATION: EXPRESSED IN BONE MARROW, SIGNAL = 3.9
OTHER INFORMATION: EXPRESSED IN ADULT LIVER, SIGNAL = 3.5
OTHER INFORMATION: EXPRESSED IN PLACENTA, SIGNAL = 4.3
OTHER INFORMATION: EXPRESSED IN HELA, SIGNAL = 5.2
OTHER INFORMATION: EXPRESSED IN FETAL LIVER, SIGNAL = 3.4
OTHER INFORMATION: EXPRESSED IN BRAIN, SIGNAL = 4.6
OTHER INFORMATION: NT HIT: AF119153.1, EVALUE 7.00e-87
OTHER INFORMATION: SWISSPROT HIT: O14511, EVALUE 3.00e-13
OTHER INFORMATION: EST_HUMAN HIT: BF108794.1, EVALUE 5.00e-58
US-10-029-386-16232

Query Match 12.5%; Score 111.8; DB 13; Length 171;

Best Local Similarity 97.6%; Pred. No. 1.2e-21;

Matches 124; Conservative 0; Mismatches 2; Indels 1; Gaps 1;

Qy        594 CAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGT 653
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        130 CAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGT 71

Qy        654 CTGCGAGGCCGAGAACATCCTGGGGAAGGACACCG-TCCGGGGCCGGCTTTACGTCAACA 712
              ||||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||
Db         70 CTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGCTCCGGGGCCGGCTTTACGTCAACA 11

Qy        713 GCGTGAG 719
              |||  ||
Db         10 GCGGTAG 4



RESULT 12 
US-09-373-658-71 

; Sequence 71, Application US/09373658 
; Publication No. US20030092900A1 
; GENERAL INFORMATION: 
; APPLICANT: Iruela-Arispe, Luisa 
APPLICANT: Hastings, Gregg A.
APPLICANT: Ruben, Steven M.
APPLICANT: Jonak, Zdenka L.
APPLICANT: Trulli, Stephen H.
APPLICANT: Fronwald, James A.
APPLICANT: Terrett, Jonathan A.

TITLE OF INVENTION: Meth1 and Meth2 Polynucleotides and Polypeptides
FILE REFERENCE: 1488.1070006
CURRENT APPLICATION NUMBER: US/09/373,658
CURRENT FILING DATE: 1999-08-13
NUMBER OF SEQ ID NOS: 125
SOFTWARE: PatentIn Ver. 2.0
SEQ ID NO 71 
LENGTH: 1986 
TYPE: DNA 
ORGANISM: Unknown 
FEATURE: 

OTHER INFORMATION: Description of Unknown Organism: Unknown 
US-09-373-658-71 

Query Match 9.4%; Score 84; DB 11; Length 1986; 

Best Local Similarity 49.0%; Pred. No. 1.6e-13; 

Matches 361; Conservative 0; Mismatches 345; Indels 30; Gaps 4; 

Qy 189 GGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGGGGGCTGCA 248

II MINI I .11 I I I I I I I I I I I I I I I I I I I 

Db 759 GGTGTGGGCGGTGAAAGCCGGGGGCTTGAAGAAGGACTCGCTGCTCACCGTGCGCCTGGG 818 

Qy 249 GCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAGCGCTACAT 308

I I I I II I I II M I I I I I I I III I I I I I I 

Db 819 GACCTGGGGCCACCCCGCCTTCCCCTCCTGCGGGAGGCTCAAGGAGGACAGCAGGTACAT 878 

Qy 309 CTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACG------------GCCTT 356

I I I I I I I I I I I I II I I I I III I I I I 

Db 879 CTTCTTCATGGAGCCCGACGCCAACAGCACCAGCCGCGCGCCGGCCGCCTTCCGAGCCTC 938

Qy 357 TGCCCCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCAC 416

I I I I I I I IN I I I I I II I I I I I I I I II I I I I I I I II 

Db 939 TTTCCCCCCTCTGGAGACGGGCCGGAACCTCAAGAAGGAGGTCAGCCGGGTGCTGTGCAA 998 

Qy 417 TGACTGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGA 476

II I I I I I I I I I I I I I I I I I I I II I I I I I I I III 

Db 999 GCGGTGCGCCTTGCCTCCCCAATTGAAAGAGATGAAAAGCCAGGAATCGGCTGCAGGTTC 1058 

Qy 477 GAAGCAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTT 536 

I I I I II I I I I I I I II III I II I I II I 

Db 1059 CAAACTAGTCCTTCGGTGTGAAACCAGTTCTGAATACTCCTCTCTCAGATTCAAGTGGTT 1118 

Qy 537 CAAGGATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCA---TCAAATATGGCAACGG 593

I I I I I II I I I M I I I I I I I I I I I I I I I 

Db 1119 CAAGAATGGGAATGAATTGAATCGAAAAAACAAACCACAAAATATCAAGATACAAAAAAA 1178

Qy 594 CAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGT 653 

I I I I I I I Mil I I I M I I III II I I I I I I I I I I I . 

Db 1179 GCCAGGGAAGTCAGAACTTCGCATTAACAAAGCATCACTGGCTGATTCTGGAGAGTATAT 1238 

Qy 654 CTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAG 713 

Mill III I I I I I I M I II II I III 



Db 1239 GTGCAAAGTGATCAGCAAATTAGGAAATGACAGTGCCTCTGCCAATATCACCATCGTGGA 1298



Qy 714 CGTGAGCACCACCCTGTCATCC------TGGTCGGGGCACGCCCGGAAGTGCAACGAGAC 767

I I I I I I I I I I M l III I I I I I I I I 

Db 1299 ATCAAACGCTACATCTACATCCACCACTGGGACAAGCCATCTTGTAAAATGTGCGGAGAA 1358 



Qy 768 AGCCAAGTCCTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCT 827 

I II II I I I I I I I I I II I I I I II I I I I I I II 

Db 1359 GGAGAAAACTTTCTGTGTGAATGGAGGGGAGTGCTTCATGGTGAAAGACCTTTCAAACCC 1418 



Qy 828 CTC---------CTGCAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGC 878

III I I I I II I I I I I I II I I II I I I I I I I I I 

Db 1419 CTCGAGATACTTGTGCAAGTGCCCAAATGAGTTTACTGGTGATCGCTGCCAAAACTACGT 1478 



Qy 879 AATGGTCAACTTCTCC 894

I I I II M I I II I I 

Db 1479 AATGGCCAGCTTCTAC 1494



RESULT 13 
US-09-989-687-71 

; Sequence 71, Application US/09989687 
; Publication No. US20040002449A1 
; GENERAL INFORMATION: 

APPLICANT: Hastings, Gregg A. 
; APPLICANT: Ruben, Steven M. 

; TITLE OF INVENTION: Meth1 and Meth2 Polynucleotides and Polypeptides

; FILE REFERENCE: 1488.107000D 

; CURRENT APPLICATION NUMBER: US/09/989,687 

; CURRENT FILING DATE: 2001-11-21 

; NUMBER OF SEQ ID NOS: 126

; SOFTWARE: PatentIn Ver. 2.0

; SEQ ID NO 71 

LENGTH: 1986 

TYPE: DNA 

ORGANISM: Unknown 
; FEATURE:

OTHER INFORMATION: Description of Unknown Organism: Unknown 
US-09-989-687-71 



Query Match 9.4%; Score 84; DB 12; Length 1986;

Best Local Similarity 49.0%; Pred. No. 1.6e-13;

Matches 361; Conservative 0; Mismatches 345; Indels 30; Gaps 4; 

Qy 189 GGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGGGGGCTGCA 248

II I I I I I I I II II I I I I Mill I II I I I I I 

Db 759 GGTGTGGGCGGTGAAAGCCGGGGGCTTGAAGAAGGACTCGCTGCTCACCGTGCGCCTGGG 818 

Qy 249 GCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAGCGCTACAT 308

I I I I II I I I II I I I I I I I I III I I I I I I 

Db 819 GACCTGGGGCCACCCCGCCTTCCCCTCCTGCGGGAGGCTCAAGGAGGACAGCAGGTACAT 878 



Qy 309 CTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACG------------GCCTT 356

I M I I I I I I I I I I I I I I I III MM 

Db 879 CTTCTTCATGGAGCCCGACGCCAACAGCACCAGCCGCGCGCCGGCCGCCTTCCGAGCCTC 938 

Qy 357 TGCCCCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCAC 416 



I I I I I I I III II I II M II I I I I II II I I I I I I I II 

Db 939 TTTCCCCCCTCTGGAGACGGGCCGGAACCTCAAGAAGGAGGTCAGCCGGGTGCTGTGCAA 998 

Qy 417 TGACTGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGA 476

II I II I I I I I I I I I I I I I I II I I I I I II I I III 

Db 999 GCGGTGCGCCTTGCCTCCCCAATTGAAAGAGATGAAAAGCCAGGAATCGGCTGCAGGTTC 1058

Qy 477 GAAGCAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTT 536 

I I I I II I I I I II I II III I II I I II I 

Db 1059 CAAACTAGTCCTTCGGTGTGAAACCAGTTCTGAATACTCCTCTCTCAGATTCAAGTGGTT 1118 

Qy 537 CAAGGATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCA---TCAAATATGGCAACGG 593

I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 1119 CAAGAATGGGAATGAATTGAATCGAAAAAACAAACCACAAAATATCAAGATACAAAAAAA 1178

Qy 594 CAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGT 653 

I I II I I I I I I I I I I I I I I III 11 I I I I I I I I I I I 

Db 1179 GCCAGGGAAGTCAGAACTTCGCATTAACAAAGCATCACTGGCTGATTCTGGAGAGTATAT 1238

Qy 654 CTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAG 713 

I I I I I I I I I I I I I I I I I II II I III 

Db 1239 GTGCAAAGTGATCAGCAAATTAGGAAATGACAGTGCCTCTGCCAATATCACCATCGTGGA 1298

Qy 714 CGTGAGCACCACCCTGTCATCC------TGGTCGGGGCACGCCCGGAAGTGCAACGAGAC 767

I I I I I I II I I I I I I M II II I I I I 

Db 1299 ATCAAACGCTACATCTACATCCACCACTGGGACAAGCCATCTTGTAAAATGTGCGGAGAA 1358 

Qy 768 AGCCAAGTCCTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCT 827 

I II II II I I I I I I I I I I I I I I I 1 I I I I I II 

Db 1359 GGAGAAAACTTTCTGTGTGAATGGAGGGGAGTGCTTCATGGTGAAAGACCTTTCAAACCC 1418

Qy 828 CTC---------CTGCAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGC 878

III I I I I I I I I I I I I I I I I I I I I I II I I I I 

Db 1419 CTCGAGATACTTGTGCAAGTGCCCAAATGAGTTTACTGGTGATCGCTGCCAAAACTACGT 1478 

Qy 879 AATGGTCAACTTCTCC 894

I I I I I II I I I I I I 
Db 1479 AATGGCCAGCTTCTAC 1494



RESULT 14 
US-08-736-019-21 

Sequence 21, Application US/08736019 
Publication No. US20030207799A1
GENERAL INFORMATION:



APPLICANT: Goodearl, Andrew 
APPLICANT: Stroobant, Paul 
APPLICANT: Minghetti, Luisa 
APPLICANT: Waterfield, Michael
APPLICANT: Marchionni, Mark 
APPLICANT: Chen, Mario 
APPLICANT: Hiles, Ian 

TITLE OF INVENTION: GLIAL MITOGENIC FACTORS, THEIR 
TITLE OF INVENTION: PREPARATION AND USE 
NUMBER OF SEQUENCES: 189 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Clark & Elbing LLP 



STREET: 176 Federal Street

CITY: Boston 
; STATE: Massachusetts 

COUNTRY: U.S.A. 
; ZIP: 02110 

; COMPUTER READABLE FORM:

; MEDIUM TYPE: 3.5" Diskette, 1.44 Mb 

; COMPUTER: IBM Compatible Pentium 

OPERATING SYSTEM: Windows 95 
; SOFTWARE: FastSeq Version 2.0 

CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/736,019

FILING DATE: 22-OCT-1996 

CLASSIFICATION: 514 
PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: 08/471,833

FILING DATE: 06-JUN-1995 
; PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/036,555 

FILING DATE: 24-MAR-1993 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 07/965,173 

FILING DATE: 23-OCT-1992 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 07/907,138 

FILING DATE: 30-JUN-1992 
; PRIOR APPLICATION DATA: 

; APPLICATION NUMBER: 07/940,389 

FILING DATE: 03-SEP-1992 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 07/863,703 

FILING DATE: 03-APR-1992 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: UK 91 07566.3 

FILING DATE: 10-APR-1991
; ATTORNEY/AGENT INFORMATION: 

NAME: Bieker-Brady, Kristina 

REGISTRATION NUMBER: 39,109 

REFERENCE/DOCKET NUMBER: 04585/0 02 OOQ
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (617) 428-0200 

TELEFAX: (617) 428-7045

TELEX: 

INFORMATION FOR SEQ ID NO: 21: 
SEQUENCE CHARACTERISTICS: 
; LENGTH: 2003 

; TYPE: nucleic acid 

STRANDEDNESS: single 
TOPOLOGY: linear 
FEATURE: 

; OTHER INFORMATION: N in positions 31 and 32 could be 

; OTHER INFORMATION: either A or G. 

US-08-736-019-21 



Query Match 9.4%; Score 84; DB 7; Length 2003; 

Best Local Similarity 49.0%; Pred. No. 1.6e-13; 

Matches 361; Conservative 0; Mismatches 345; Indels 30; Gaps 4 



Qy 189 GGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGGGGGCTGCA 248

II I I I I I I I II I I I I I I I I I I I I I I I I I I I 

Db 762 GGTGTGGGCGGTGAAAGCCGGGGGCTTGAAGAAGGACTCGCTGCTCACCGTGCGCCTGGG 821

Qy 249 GCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAGCGCTACAT 308

I I I I II I I II I I I I II I I I III I I I II I 

Db 822 GACCTGGGGCCACCCCGCCTTCCCCTCCTGCGGGAGGCTCAAGGAGGACAGCAGGTACAT 881

Qy 309 CTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACG------------GCCTT 356

I II I II I I I I I I I I I I I I III MM 

Db 882 CTTCTTCATGGAGCCCGACGCCAACAGCACCAGCCGCGCGCCGGCCGCCTTCCGAGCCTC 941

Qy 357 TGCCCCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCAC 416 

I 11111 I III I I I I I II I I I I II I I II I I II I I I I I 

Db 942 TTTCCCCCCTCTGGAGACGGGCCGGAACCTCAAGAAGGAGGTCAGCCGGGTGCTGTGCAA 1001 

Qy 417 TGACTGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGA 476

II I I II I II I I I I I I I I I I I I I I I I I I I I I III 

Db 1002 GCGGTGCGCCTTGCCTCCCCAATTGAAAGAGATGAAAAGCCAGGAATCGGCTGCAGGTTC 1061 

Qy 477 GAAGCAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTT 536

I I I I II I I I I I I I II III I II INN 

Db 1062 CAAACTAGTCCTTCGGTGTGAAACCAGTTCTGAATACTCCTCTCTCAGATTCAAGTGGTT 1121 

Qy 537 CAAGGATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCA---TCAAATATGGCAACGG 593

I I M I I II I I I I I I I M I I I I I I I I M 

Db 1122 CAAGAATGGGAATGAATTGAATCGAAAAAACAAACCACAAAATATCAAGATACAAAAAAA 1181

Qy 594 CAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGT 653

I I I I I I I I I I I I I I I I I I III II I I I I I I I I I I I 

Db 1182 GCCAGGGAAGTCAGAACTTCGCATTAACAAAGCATCACTGGCTGATTCTGGAGAGTATAT 1241

Qy 654 CTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAG 713 

I I I I I III I I I I I I I I I II II I III 

Db 1242 GTGCAAAGTGATCAGCAAATTAGGAAATGACAGTGCCTCTGCCAATATCACCATCGTGGA 1301 

Qy 714 CGTGAGCACCACCCTGTCATCC------TGGTCGGGGCACGCCCGGAAGTGCAACGAGAC 767

I I I I I I I I I I III III I I I I I I I I 

Db 1302 ATCAAACGCTACATCTACATCCACCACTGGGACAAGCCATCTTGTAAAATGTGCGGAGAA 1361 

Qy 768 AGCCAAGTCCTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCT 827

I II II I I I I I M I I I I I I I I I I I I I I I I II 

Db 1362 GGAGAAAACTTTCTGTGTGAATGGAGGGGAGTGCTTCATGGTGAAAGACCTTTCAAACCC 1421

Qy 828 CTC---------CTGCAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGC 878

III I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1422 CTCGAGATACTTGTGCAAGTGCCCAAATGAGTTTACTGGTGATCGCTGCCAAAACTACGT 1481

Qy 879 AATGGTCAACTTCTCC 894

I I I I I I I I I I I I I 

Db 1482 AATGGCCAGCTTCTAC 1497



RESULT 15 
US-09-366-886-71 

; Sequence 71, Application US/09366886 



; Publication No. US20030040465A1

; GENERAL INFORMATION: 

; APPLICANT: Gywnne, David I. 

; APPLICANT: Mahanthappa, Nagesh K. 

; APPLICANT: Marchionni, Mark A. 

; APPLICANT: Bermingham-McDonogh, Olivia 

; APPLICANT: Goldin, Stanley M. 

; APPLICANT: McBurney, Robert N. 

; TITLE OF INVENTION: USE OF NEUREGULINS AS MODULATORS OF 

; TITLE OF INVENTION: CELLULAR COMMUNICATION 

; FILE REFERENCE: 04585/041005 

; CURRENT APPLICATION NUMBER: US/09/366,886

; CURRENT FILING DATE: 1999-08-04 

; PRIOR APPLICATION NUMBER: US 08/341,018 

PRIOR FILING DATE: 1994-11-17 
; NUMBER OF SEQ ID NOS: 87 

; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 71 

LENGTH: 2003

TYPE: DNA 
; ORGANISM: Homo sapiens 

FEATURE: 

NAME/KEY: CDS

LOCATION: (265)...(1530)
; NAME/KEY: variation

LOCATION: (31)...(32)

OTHER INFORMATION: n can be a or g. 
US-09-366-886-71 

Query Match 9.4%; Score 84; DB 11; Length 2003; 

Best Local Similarity 49.0%; Pred. No. 1.6e-13; 

Matches 361; Conservative 0; Mismatches 345; Indels 30; Gaps 4; 



Qy 189 GGGTCGGGTGGCGTTGGTAAAGGTGCTGGACAAGTGGCCGCTCCGGAGCGGGGGGCTGCA 248

II I I I I I I I II I I I I I I I I I I I I I I I I I I I 

Db 762 GGTGTGGGCGGTGAAAGCCGGGGGCTTGAAGAAGGACTCGCTGCTCACCGTGCGCCTGGG 821

Qy 249 GCGCGAGCAGGTGATCAGCGTGGGCTCCTGTGTGCCGCTCGAAAGGAACCAGCGCTACAT 308 

I I I I II I I I I I I I I Mill III I I I I I I 

Db 822 GACCTGGGGCCACCCCGCCTTCCCCTCCTGCGGGAGGCTCAAGGAGGACAGCAGGTACAT 881

Qy 309 CTTTTTCCTGGAGCCCACGGAACAGCCCTTAGTCTTTAAGACG------------GCCTT 356

II I I I I I I I I I I I I I I I I III I I I I 

Db 882 CTTCTTCATGGAGCCCGACGCCAACAGCACCAGCCGCGCGCCGGCCGCCTTCCGAGCCTC 941

Qy 357 TGCCCCCCTCGATACCAACGGCAAAAATCTCAAGAAAGAGGTGGGCAAGATCCTGTGCAC 416

I INN I III I I I I I I I I I I I I I n II I I I I I I I II 

Db 942 TTTCCCCCCTCTGGAGACGGGCCGGAACCTCAAGAAGGAGGTCAGCCGGGTGCTGTGCAA 1001 

Qy 417 TGACTGCGCCACCCGGCCCAAGTTGAAGAAGATGAAGAGCCAGACGGGACAGGTGGGTGA 476 

I I M I I I I I I I I I II I I I I I I I I I I I I I I I III 

Db 1002 GCGGTGCGGCTTGCCTCCCCAATTGAAAGAGATGAAAAGCCAGGAATCGGCTGCAGGTTC 1061 

Qy 477 GAAGCAATCGCTGAAGTGTGAGGCAGCAGCCGGTAATCCCCAGCCTTCCTACCGTTGGTT 536 

I I I I II I Ml I I I II III I II I I I II 

Db 1062 CAAACTAGTCCTTCGGTGTGAAACCAGTTCTGAATACTCCTCTCTCAGATTCAAGTGGTT 1121 



Qy 537 CAAGGATGGCAAGGAGCTCAACCGCAGCCGAGACATTCGCA---TCAAATATGGCAACGG 593

1 1 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 I I II I II I II

Db 1122 CAAGAATGGGAATGAATTGAATCGAAAAAACAAACCACAAAATATCAAGATACAAAAAAA 1181

Qy 594 CAGAAAGAACTCACGACTACAGTTCAACAAGGTGAAGGTGGAGGACGCTGGGGAGTATGT 653

I I I I I I I I I I I I I I I I I I III II II I I I I I II I I

Db 1182 GCCAGGGAAGTCAGAACTTCGCATTAACAAAGCATCACTGGCTGATTCTGGAGAGTATAT 1241

Qy 654 CTGCGAGGCCGAGAACATCCTGGGGAAGGACACCGTCCGGGGCCGGCTTTACGTCAACAG 713

< < I > I III I I I II M M I I II I III

Db 1242 GTGCAAAGTGATCAGCAAATTAGGAAATGACAGTGCCTCTGCCAATATCACCATCGTGGA 1301

Qy 714 CGTGAGCACCACCCTGTCATCC------TGGTCGGGGCACGCCCGGAAGTGCAACGAGAC 767

I I > I I I I I I 1 I I 1 I I I I I I I III,

Db 1302 ATCAAACGCTACATCTACATCCACCACTGGGACAAGCCATCTTGTAAAATGTGCGGAGAA 1361

Qy 768 AGCCAAGTCCTATTGCGTCAATGGAGGCGTCTGCTACTACATCGAGGGCATCAACCAGCT 827

I II II II II MIIIMI I MM I I I I I I II

Db 1362 GGAGAAAACTTTCTGTGTGAATGGAGGGGAGTGCTTCATGGTGAAAGACCTTTCAAACCC 1421

Qy 828 CTC---------CTGCAAGTGTCCTGTGGGATACACCGGGGACAGGTGTCAGCAGTTCGC 878

III I M I II I I II I I I I I I I I I I I I I III,

Db 1422 CTCGAGATACTTGTGCAAGTGCCCAAATGAGTTTACTGGTGATCGCTGCCAAAACTACGT 1481

Qy 879 AATGGTCAACTTCTCC 894

Mill II Mill I

Db 1482



Search completed: January 14, 2004, 14:11:42 
Job time : 352.751 secs



